Data science leaders recognize the importance of testing and validation. It’s a best practice in traditional software development and one that is being adopted to systematically mitigate AI risk during model development. However, most model testing is conducted ad hoc by individuals or teams, which is time intensive and far from exhaustive. To enforce standards across their organization, and to meet both self-imposed protocols and regulatory requirements, leaders need a way to ensure that all models are rigorously tested and continuously validated to protect against security, ethical, and operational risk. In short, they need Robust Intelligence.
Robust Intelligence delivers end-to-end machine learning integrity through a process of continuous validation that integrates into your existing CI/CD workflow. While we’ve developed hundreds of tests across various categories that run automatically on models in development and production, we recognize that companies may want to add custom tests specific to their industry or use case. Customization empowers data science leaders and individual contributors to ensure that their models are rigorously tested and validated, taking into account the distinct needs of each business and thereby mitigating AI risk.
Below are examples of customization opportunities on the Robust Intelligence platform:
Our custom test support lets users upload a Python file containing their test definition, which is then incorporated into our platform. Custom tests allow users to encode their own specifications of model failure and business criteria. For example, a company that performs audits may have specialized methods of detecting abnormal inputs to a model that are not part of Robust Intelligence’s default test suite. Users can also create a monitor for their custom test, enabling them to track pass and fail results even after the model is deployed.
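To make this concrete, a custom test definition might look something like the sketch below. The function signature, column name, data format, and return structure here are illustrative assumptions, not the platform’s actual API; consult the product documentation for the supported interface.

```python
# Hypothetical sketch of a custom test definition. The signature, column name,
# and return structure are assumptions made for illustration only.
import pandas as pd


def abnormal_amount_test(ref_data: pd.DataFrame, eval_data: pd.DataFrame) -> dict:
    """Flag evaluation rows whose 'transaction_amount' falls far outside the
    range observed in the reference data (an audit-style abnormal-input check)."""
    col = "transaction_amount"  # hypothetical feature column
    lower = ref_data[col].quantile(0.001)
    upper = ref_data[col].quantile(0.999)
    abnormal = eval_data[(eval_data[col] < lower) | (eval_data[col] > upper)]
    abnormal_rate = len(abnormal) / max(len(eval_data), 1)

    return {
        "metric": abnormal_rate,         # fraction of abnormal inputs
        "passed": abnormal_rate < 0.01,  # fail if more than 1% of inputs are abnormal
        "details": {"lower_bound": float(lower), "upper_bound": float(upper)},
    }
```

A test like this could then be paired with a monitor so the abnormal-input rate continues to be tracked after deployment.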
All of our tests have thresholds that determine whether the model passes or fails, and with what severity. Test thresholds can be configured directly from the UI, where users choose between “Less Sensitive” and “More Sensitive” settings depending on the model’s use case.
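As a rough illustration of the idea (not the platform’s actual threshold logic, which is configured through the UI), a sensitivity setting can be thought of as selecting a stricter or looser set of numeric cutoffs that map a test metric to a severity:

```python
# Illustrative only: how a sensitivity setting could map a test metric to a
# severity level. The setting names, cutoff values, and severity labels are
# assumptions for the sake of the example.
SENSITIVITY_THRESHOLDS = {
    "Less Sensitive": {"warning": 0.10, "alert": 0.25},
    "More Sensitive": {"warning": 0.02, "alert": 0.05},
}


def grade_result(metric_value: float, sensitivity: str = "More Sensitive") -> str:
    """Return a severity for a test metric given the chosen sensitivity."""
    thresholds = SENSITIVITY_THRESHOLDS[sensitivity]
    if metric_value >= thresholds["alert"]:
        return "ALERT"
    if metric_value >= thresholds["warning"]:
        return "WARNING"
    return "PASS"
```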
Custom metrics offer a way for both technical and nontechnical leads to understand model performance. Standard data science metrics such as F1 score and ROC-AUC, while useful for assessing a model’s performance from a technical standpoint, may not always provide a clear indication of the impact on the business. Custom metrics, on the other hand, can be tailored to align with the organization’s objectives, enabling leaders to evaluate the effectiveness of their models in the context of the company’s overall strategy. For instance, a consumer of the model might care primarily about the revenue the model generates on each prediction, a metric that may depend on a formula specific to the way the company uses the model. For custom metrics, Robust Intelligence reports both the model’s overall performance on the provided evaluation data and its performance on different subsets of the data.
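A revenue-per-prediction metric of this kind might be sketched as follows. The column semantics, per-unit revenue and cost figures, and function signature are hypothetical; the real formula would reflect how the company actually monetizes the model’s predictions.

```python
# Hedged sketch of a custom business metric: average net revenue per prediction.
# The unit economics and the assumption that a positive prediction is "acted on"
# are illustrative, not taken from any real deployment.
import pandas as pd


def revenue_per_prediction(preds: pd.Series, actuals: pd.Series,
                           unit_revenue: float = 12.0, unit_cost: float = 3.0) -> float:
    """Average net revenue per prediction, assuming each acted-on prediction costs
    `unit_cost` and each correct positive prediction earns `unit_revenue`."""
    acted_on = preds == 1
    correct = acted_on & (actuals == 1)
    total = unit_revenue * correct.sum() - unit_cost * acted_on.sum()
    return total / max(len(preds), 1)
```

Reported overall and on subsets of the evaluation data, a metric like this lets a business owner see where the model creates or destroys value, not just where its statistical accuracy drifts.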
In summary, the customization aspect of our platform plays a pivotal role in bridging the gap between data science and business outcomes. By facilitating the creation of tailored tests and metrics, we enable organizations to derive more meaningful insights and foster a deeper understanding of how their models influence real-world performance.
In addition, our governance dashboard provides a window into all models in production within a workspace, offering a centralized view of all custom and default testing. Health status and the variance of business metrics over time are available for models in production, helping teams evaluate model risk against business risk. This gives managers and executives easily digestible information about each of their models and the associated risks, along with the custom information incorporated across the platform. These dashboards make it easy to track model owners and take action on specific projects as needed, and they are especially useful for tracking the business KPIs tied to each model in production.
These features offer customizability and extensibility to solve pain points for data science leaders and to facilitate effective communication of model and business risks. At Robust Intelligence, we strive to mitigate AI risk and instill model integrity. In doing so, we enable data science teams to operate more efficiently and effectively while supporting governance and customization across organizations. To learn more, request a product demo here.