The rapid pace of AI development is driven in part by the availability of open source models. Organizations of all sizes use them to accelerate novel applications of AI; in the rush to leverage these resources, however, models are often not thoroughly evaluated for risk before being downloaded and put into production. This poses a significant unmanaged security risk for organizations.
Open source models of all types are widely available in public repositories such as Hugging Face, spaCy, PyTorch Hub, TensorFlow Hub, the NVIDIA NGC AI software hub, and Papers With Code. These can be great assets to an organization, but just as with any public software source, it is crucial to have third-party tools in place to identify potential vulnerabilities and weaknesses in third-party models.
A free resource to mitigate AI supply chain risk
That is why Robust Intelligence is releasing the AI Risk Database, a free and community-supported resource to help companies assess open source AI models for supply chain vulnerabilities. The database includes comprehensive test results and corresponding risk scores for over 170,000 models, as well as a growing number of vulnerability reports submitted by AI and cybersecurity researchers.
The AI Risk Database is for ML developers, security professionals, and community contributors. It provides a centralized resource where ML developers can investigate the risks tied to specific public model repositories, and where researchers can formally report and share discovered risks. This public resource includes:
- a comprehensive database of open source models that is searchable by name or characteristic,
- risk scores derived from dynamic analysis of the models for security, ethical, and operational risks,
- model vulnerability reports from both automated scanners and human contributors,
- a software bill of materials for public models that includes observability into shared resource files across different model repositories, and
- results of scanning model file resources for file-based anomalies (e.g., arbitrary code execution in pickle, pytorch, yaml, tensorflow, and other file formats).
A brief tour of the AI Risk Database
1) A searchable database for public models
Upon initial release, the AI Risk Database contains public models from the popular Hugging Face and PyTorch Hub repositories, with more to follow. Public models can be searched for by:
- their name (e.g., <code inline>distilbert-base-uncased-finetuned-sst-2-english</code>),
- a repository URL (e.g., https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english), or
- a package URL (purl) string, which reliably identifies and locates a software package and can be pinned to a specific version of the model (e.g., <code inline>pkg:huggingface/distilbert-base-uncased-finetuned-sst-2-english@da4550f926c3cc88b2a4617ce78db8145801a8f5</code>); a short sketch of building such a purl programmatically appears below.
In addition, one may search for vulnerability reports filed for a model, or generally search for file artifacts (e.g., <code inline>os.system</code>) that might be contained in model resource files.
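For programmatic workflows, a purl like the one above can be built or parsed with the open source packageurl-python library. This is a minimal sketch, not something the AI Risk Database requires; the model name and revision hash are the example values from the list above.

```python
# Minimal sketch: building and parsing a purl for a Hugging Face model with
# the open source packageurl-python library (pip install packageurl-python).
from packageurl import PackageURL

# Construct a purl pinned to a specific model revision (example values from above).
purl = PackageURL(
    type="huggingface",
    name="distilbert-base-uncased-finetuned-sst-2-english",
    version="da4550f926c3cc88b2a4617ce78db8145801a8f5",
)
print(purl.to_string())
# pkg:huggingface/distilbert-base-uncased-finetuned-sst-2-english@da4550f926c3cc88b2a4617ce78db8145801a8f5

# Parse an existing purl string back into its components.
parsed = PackageURL.from_string(purl.to_string())
print(parsed.type, parsed.name, parsed.version)
```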
2) Security, ethical, and operational risks
For a given machine learning task, a developer may have thousands of models from which to choose. How does one select the right model? Today, the choice is often based on reputation or on a model card (if one exists), in which the author self-reports performance on a dataset or task. The AI Risk Database gives corporations and individuals concerned about these risks a way to adopt a third-party verification strategy that accounts not only for model performance but also for other risk factors. These include:
- Security risk: How does the model perform on worst-case inputs optimized by an adversary for a specific outcome? These might include small worst-case perturbations found with algorithmic attacks that require only API access to the model (e.g., HopSkipJump or Square Attack from the Adversarial Robustness Toolbox), or targeted homoglyph or synonym attacks for NLP models.
- Fairness risk: How does the model perform on inputs representing different demographic groups or protected attributes? Does it treat those groups equitably?
- Operational risk: How does the model perform on a holdout set compared to its training set? How does it perform under small changes to the input? These might include horizontal flips or contrast changes in an input image, or commonly misspelled words or typos in NLP tasks (see the sketch after this list).
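As a concrete illustration of the operational checks above, here is a minimal sketch that compares a PyTorch image classifier's prediction before and after simple torchvision transformations. The model and image tensor are assumed inputs, and the actual tests behind the risk scores are far more extensive.

```python
# Minimal sketch (not the platform's implementation): flag whether simple
# image transformations flip a PyTorch classifier's prediction.
import torch
from torchvision.transforms import functional as TF

def predict(model, image):
    """Predicted class index for a single image tensor of shape (C, H, W)."""
    with torch.no_grad():
        return model(image.unsqueeze(0)).argmax(dim=1).item()

def transformation_failures(model, image):
    """Return which basic transformations change the model's prediction."""
    baseline = predict(model, image)  # model is assumed to be in eval mode
    variants = {
        "horizontal_flip": TF.hflip(image),
        "contrast_up": TF.adjust_contrast(image, contrast_factor=1.5),
        "contrast_down": TF.adjust_contrast(image, contrast_factor=0.5),
    }
    return {name: predict(model, img) != baseline for name, img in variants.items()}
```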
Evaluating a model requires a dataset appropriate to that model. As such, we only run tests on models whose public repository reports a dataset that we are able to locate.
In each case, we leverage a subset of AI stress tests from the Robust Intelligence platform and the Adversarial Robustness Toolbox to measure model sensitivity in these categories. The results contribute to a score that reflects the pass/fail rate of tests relative to other models we’ve tested (i.e., “grading on a curve”), which lets developers easily select from among the top-performing models. We welcome contributions to these scores from other automated sources. Interested parties should contact airdb@robustintelligence.com.
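As an example of the security tests, a black-box attack such as HopSkipJump needs only prediction access to a model. The sketch below runs it with ART against a tiny stand-in PyTorch classifier and random data; in practice the classifier would be the downloaded public model and its reported evaluation dataset.

```python
# Minimal sketch of a black-box adversarial attack with the Adversarial
# Robustness Toolbox (ART). The toy model and random "images" below are
# placeholders for a real public model and its evaluation data.
import numpy as np
import torch
from art.attacks.evasion import HopSkipJump
from art.estimators.classification import PyTorchClassifier

# A tiny stand-in classifier; in practice, load the model under test instead.
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))

classifier = PyTorchClassifier(
    model=model,
    loss=torch.nn.CrossEntropyLoss(),
    input_shape=(3, 32, 32),
    nb_classes=10,
    clip_values=(0.0, 1.0),
)

# A handful of random inputs stand in for real test images here.
x_test = np.random.rand(4, 3, 32, 32).astype(np.float32)

# HopSkipJump is black-box: it only queries the classifier's predictions.
attack = HopSkipJump(classifier=classifier, targeted=False, max_iter=10, max_eval=500, init_eval=10)
x_adv = attack.generate(x=x_test)

# How often does the prediction flip under attack?
clean_pred = classifier.predict(x_test).argmax(axis=1)
adv_pred = classifier.predict(x_adv).argmax(axis=1)
print("prediction flip rate:", float((clean_pred != adv_pred).mean()))
```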
"While public model repositories are accelerating the pace of open source AI development, it’s essential that such models are thoroughly evaluated for risk before use - just as with any public software source. The AI Risk Database and other tools that collate model risks are essential to validate the security and robustness of open source models,” said Beat Buesser, Voting Member of the Technical Advisory Committee of the Linux Foundation AI & Data (LFAI) and maintainer of the Adversarial Robustness Toolbox (ART). “I am pleased by the comprehensive analysis of the AI Risk Database and the inclusion of additional community-supported resources, such as LFAI’s ART to measure model sensitivity to adversarial attacks."
3) Risk report contributions
While risk scores give a coarse ranking of model risk among public repositories, risk or vulnerability reports provide details of a specific risk. Ideally, contributors provide enough context to describe a specific risk with details to reproduce it. Risk reports contribute to the body of evidence about a model’s suitability for high-stakes applications.
Users who are logged into the platform (currently, only GitHub logins are supported) can submit vulnerability reports in Markdown and include reference URLs as additional evidence. These reports are designed to be easy for a reporting user to fill out and submit, and the community can vote on their quality and veracity through simple voting mechanisms.
To seed the report-generating process, we have included automatically generated reports from the Robust Intelligence platform. We welcome risk report contributions from other automated sources. Interested parties should contact airdb@robustintelligence.com.
4) Software bill of materials for public models
For each model version, a software bill of materials is reported. Every file, including binary model resource files, is analyzed, which lets the user spot anomalous artifacts in files and track file re-use across different models.
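One simple way to surface file re-use, sketched below, is to hash every resource file and group repositories that share identical bytes. The directory-per-repository layout here is an illustrative assumption, not the database's actual schema.

```python
# Minimal sketch: find resource files shared byte-for-byte across model repos
# by grouping files on their SHA-256 content digest.
import hashlib
from collections import defaultdict
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Content digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def shared_files(repo_dirs):
    """Map content digest -> list of (repo, filename) pairs with identical bytes."""
    by_digest = defaultdict(list)
    for repo in repo_dirs:
        for path in Path(repo).rglob("*"):
            if path.is_file():
                by_digest[sha256_of(path)].append((repo, path.name))
    return {d: locs for d, locs in by_digest.items() if len(locs) > 1}
```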
Currently, in-depth file analysis exists for ML model resources that include pickle, pytorch, numpy, tensorflow, and yaml files.
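To give a flavor of what that analysis looks for, here is a minimal sketch that walks a pickle file's opcodes with Python's standard-library pickletools module and flags imports commonly associated with arbitrary code execution. It is a crude heuristic, not the production scanner.

```python
# Minimal sketch: flag pickle GLOBAL opcodes that import modules commonly
# abused for code execution on load (os.system, posix.system, etc.).
import pickletools

SUSPICIOUS_MODULES = {"os", "posix", "subprocess", "builtins", "webbrowser"}

def suspicious_pickle_imports(path):
    """Return (module, name) pairs imported by GLOBAL opcodes in a pickle file.

    Crude heuristic: STACK_GLOBAL imports (newer pickle protocols) would need
    stack tracking and are not resolved here.
    """
    findings = []
    with open(path, "rb") as f:
        data = f.read()
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            module, name = arg.split(" ", 1)
            if module.split(".")[0] in SUSPICIOUS_MODULES:
                findings.append((module, name))
    return findings
```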
Initial findings reveal significant vulnerabilities
Using the AI Risk Database, we were quickly able to discover several areas of risk in public repositories, which we summarize briefly below. We will follow up with additional blog posts that cover these risks in more detail.
Risk scores: image classifier models
- 50% of public image classifier models tested fail 38% of simple image transformation tests (e.g., contrast increase/decrease, horizontal/vertical image flip, defocus, etc.).
- 50% of public image classifier models tested fail 40% of adversarial attacks.
- Three image classifier models with over 250,000 downloads are in the 36th percentile or less of overall risk score.
Risk scores: NLP models
- 50% of public NLP models tested fail 14% of adversarial text transformations (e.g., adversarial homoglyph substitution, adversarial synonym replacement, adversarial typo injection, etc.).
- Three NLP classifiers with over 75,000 downloads are in the 38th percentile or less of model security score.
File scanning anomalies
We have discovered dozens of repositories with pytorch, tensorflow, pickle, or yaml resource files that include unsafe or vulnerable dependencies. These include the following (a brief illustration of why such calls matter appears after this list):
- <code inline>os.system</code>, <code inline>posix.system</code>, <code inline>eval</code> (54 in yaml files alone), <code inline>popen</code>, and <code inline>webbrowser.open</code>. All cases discovered thus far appear to be benign (yaml evaluating arithmetic expressions), or for demonstration or research purposes, including by us.
- Several instances of dependencies that aren’t explicitly bad, but may contain vulnerabilities themselves. For example, https://huggingface.co/coldwaterq/sectest/tree/main depends on <code inline>zlib.decompress</code> for which there are known vulnerabilities.
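To illustrate why calls like these get flagged, the sketch below shows how a YAML document can trigger code execution the moment it is loaded with PyYAML's unsafe loader, and how safe_load refuses the same payload. The payload is purely illustrative.

```python
# Minimal sketch: the difference between PyYAML's unsafe and safe loaders
# when a document embeds a Python call tag.
import yaml

malicious = '!!python/object/apply:os.system ["echo code execution on load"]'

# Unsafe: UnsafeLoader constructs arbitrary Python objects, so this line
# would actually run the shell command. Left commented out on purpose.
# yaml.load(malicious, Loader=yaml.UnsafeLoader)

# Safe: safe_load rejects Python-object tags and raises an error instead.
try:
    yaml.safe_load(malicious)
except yaml.YAMLError as exc:
    print("blocked:", exc)
```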
Much, much more to come
Open, public models are a good thing: they accelerate innovative AI applications. However, their safe adoption has often been within reach of only a few large organizations.
“The rapid adoption of AI is driving demand for transparency and model governance,” said Daniel Rohrer, vice president of AI security at NVIDIA. “This growth will require innovative technologies that enable machine-scale review to promote a trustworthy and healthy AI ecosystem.”
With the initial release of the AI Risk Database, we aim to give the community a practical resource for mitigating AI risks that currently go unaddressed in most organizations and projects. We welcome feedback.
We invest in tooling and community collaborations like the AI Risk Database because we believe a rising tide lifts all boats. As a community, we want to establish the expectation that AI models should be maximally understood and de-risked before their use in important applications.
We hope you find the AI Risk Database to be incredibly useful. Give it a try by viewing the risk scores for a particular model or submitting a vulnerability report.