The advancement of AI has long outpaced the controls required to keep companies and individuals safe. The use of sophisticated, third-party models makes it even more challenging for companies to ensure that models work as intended. Companies are under pressure to innovate with the latest AI resources, but many are hesitant to release applications built on generative models given the risks. This is why AI risk management frameworks have become a top priority for business leaders.
Leading AI services companies also recognize the need to protect against AI risk. Last Friday, the Biden-Harris Administration announced it had secured voluntary commitments from seven large AI companies to adhere to specific AI risk management measures. This agreement serves to accomplish two goals: increase the public’s confidence in AI services developed by “big tech” and serve as a benchmark for what enterprises should require from their own AI systems, including the vendors in their AI stack.
While many were expecting the White House to announce enforceable AI risk management standards that would apply broadly, this is a significant step forward on the path to regulation in the United States: “There is much more work underway. The Biden-Harris Administration is currently developing an executive order and will pursue bipartisan legislation to help America lead the way in responsible innovation.”
Commitments to the Development and Use of AI
The voluntary commitments center around three principles: safety, security, and trust. Let’s review the agreement by Amazon, Anthropic, Google, Inflection, Meta, Microsoft, and OpenAI - and many other companies by extension. The companies pledged to keep the following commitments in effect until formal regulations catch up:
Safety
(1) Commit to internal and external red-teaming of models or systems in areas including misuse, societal risks, and national security concerns, such as bio, cyber, and other safety areas
What this means: Red teaming involves simulating attacks on AI systems in an effort to identify and subsequently mitigate vulnerabilities. The ability to automate a wide variety of attacks and analyze the results is needed to be effective. The companies agreed to testing by independent experts.
(2) Work toward information sharing among companies and governments regarding trust and safety risks, dangerous or emergent capabilities, and attempts to circumvent safeguards
What this means: Information sharing is already a best practice in the cybersecurity domain. Information Sharing and Analysis Centers (ISACs) were formed subsequent to a 1998 Presidential Directive, and a formal process for reporting common vulnerabilities and exposures (CVEs) has been in place since 1999. Some measures have already been adopted for AI systems. The companies have agreed to share information on AI risk across government, the private sector, and academia.
Security
(3) Invest in cybersecurity and insider threat safeguards to protect proprietary and unreleased model weights
What this means: Model weights determine the influence that various inputs have on a model’s output, the manipulation of which can have a significant impact on model accuracy. The companies have agreed to treat unreleased model weights as core IP, applying the same safeguards and protections.
(4) Incent third-party discovery and reporting of issues and vulnerabilities
What this means: If red teaming exercises are run periodically rather than continuously, it’s possible for vulnerabilities to go undetected for long periods of time. It’s also possible that a company’s internal security exercises may miss certain vulnerabilities. The companies have agreed to roll out bug bounty programs that incent outside testing and reporting of weaknesses in AI systems.
Trust
(5) Develop and deploy mechanisms that enable users to understand if audio or visual content is AI-generated, including robust provenance, watermarking, or both, for AI-generated audio or visual content
What this means: It’s very difficult, if not impossible, to accurately attribute AI-generated content. Today, no general technical method has been widely deployed to track generative AI output. The companies agreed to contribute to the development of a technical framework to help users distinguish between AI and human-generated content, and if it came from their system.
(6) Publicly report model or system capabilities, limitations, and domains of appropriate and inappropriate use, including discussion of societal risks, such as effects on fairness and bias
What this means: Comprehensive reports will help inform users of the capabilities and limitations of each model. Ultimately, every model will have some level of risk and it’s a best practice to fully assess a model before building on it. The companies agreed to publish reports for any new significant model, which will include safety evaluations, performance limitations, fairness data, and adversarial testing results.
(7) Prioritize research on societal risks posed by AI systems, including on avoiding harmful bias and discrimination, and protecting privacy
What this means: In order to strengthen public trust in AI, the industry needs to ensure that models don’t disclose private information or reinforce harmful biases. The companies have agreed to prioritize AI safety research and proactively manage AI risk.
(8) Develop and deploy frontier AI systems to help address society’s greatest challenges
What this means: There is incredible potential in AI-powered systems to help solve many fundamental society challenges, including the fields of healthcare, climate, and security. The companies have committed to support bleeding-edge AI systems to meet such challenges, as well as ensure that people understand and prosper from AI development.
What Are the Implications for Enterprises?
While this commitment doesn’t directly impact companies apart from the seven that opted in, enterprises should certainly take notice. This bellwether agreement should inform AI risk management strategy, as the Biden-Harris Administration has explicitly stated its intentions to broaden these measures through an executive order and legislation, building on the White House AI Bill of Rights, NIST AI Risk Management Framework, and previous directives.
Here’s how enterprise can apply these learnings to their own AI initiatives today:
- AI Red Teaming - Engage independent experts to test models for security vulnerabilities, preferably using an automated approach that continuously validates AI systems. Ensure that newly disclosed vulnerabilities and attack vectors are included.
- AI Risk Assessment - Run comprehensive tests against models to assess their security, ethical (bias and toxicity), and operational risk profile. This should ideally be performed before building on a new model, but assessing risk continuously pre and post-deployment is a best practice.
- Reporting - Enable security, GRC, and data science teams to assess AI risk by translating statistical test results into clearly defined outputs for various stakeholders. Ensure all models meet security, ethical, and operational standards. Map results to AI risk management frameworks, such as the NIST AI RMF, for compliance and reporting purposes.
How Can Robust Intelligence Help?
Robust Intelligence is an end-to-end AI risk management platform. Through a process of continuous validation, our platform performs hundreds of automated tests on models and data throughout the AI lifecycle to proactively mitigate security, ethical, and operational vulnerabilities. Robust Intelligence gives organizations the confidence to use any type of model and simplifies AI governance.
Enterprises trust our deep expertise and testing methodology to confidently operationalize AI risk management frameworks that meet internal standards, as well as current and future regulatory requirements. We’re trusted by leading companies including JPMorgan Chase, Expedia, ADP, Deloitte, PwC, and the U.S. Department of Defense.
For more on the NIST AI RMF, watch our fireside chat with a key contributor to the framework.