Keeping Teenagers out of Trouble: The Art and Science of Machine Learning Governance

Read - The Machine Learning Lifecycle and MLOps: Building and Operationalizing ML Models - Part I

Read - The Machine Learning Lifecycle and MLOps: Building and Operationalizing ML Models - Part II 

Read - The Machine Learning Lifecycle and MLOps: Building and Operationalizing ML Models - Part III

Read - The Machine Learning Lifecycle and MLOps: Building and Operationalizing ML Models - Part IV

Parents spend a lot of time governing their kids. If they are vigilant, they reduce the risk of trouble when those kids become teenagers and start doing things on their own.

Machine learning (ML) offers a similar challenge. If you govern how you build, train, and operate ML models, you can reduce the risk of angry customers, public embarrassment, and regulatory penalties. But it takes a lot of vigilance.

Machine learning is a subset of artificial intelligence in which an algorithm discovers patterns in data. ML relies on a model, which is essentially an equation that defines the relationship between data inputs and outcomes. Based on this relationship, the model generates a score that predicts, classifies, or prescribes a future outcome. Data scientists apply various techniques to create this model, including supervised learning, which learns from known historical outcomes, and unsupervised learning, which finds patterns without knowing outcomes beforehand.
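To make the "essentially an equation" point concrete, here is a minimal sketch of a trained model reduced to weights that map inputs to a score, with a threshold turning the score into a classification. The weights, inputs, and threshold are all hypothetical, not learned from real data.

```python
# A minimal sketch of the idea that a model is "essentially an equation":
# hand-picked weights map customer inputs to a score, and a threshold turns
# the score into a classification. Weights and inputs are illustrative.

def churn_score(purchase_frequency: float, product_rating: float) -> float:
    """Score a customer; higher means more likely to churn (hypothetical weights)."""
    w_freq, w_rating, bias = -0.8, -0.5, 2.0
    return w_freq * purchase_frequency + w_rating * product_rating + bias

def classify(score: float, threshold: float = 0.5) -> str:
    """Turn the numeric score into a predicted outcome."""
    return "at-risk" if score > threshold else "satisfied"

# A frequent buyer who rates products highly scores low, i.e. unlikely to churn.
print(classify(churn_score(purchase_frequency=2.0, product_rating=4.5)))  # satisfied
```

In a real supervised learning project, of course, the training process rather than a human picks those weights.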

The lifecycle of a machine learning project entails three phases that we explored in earlier blogs. Each of these phases comprises multiple processes and introduces governance requirements. 

  1. Data and feature engineering. Ingest and transform all relevant input data, define the key features, and label historical outcomes. This introduces the requirements of data quality and data privacy.

  2. Model development. Select the right ML technique and algorithm, then train that algorithm to produce a model. Make that model unbiased and explainable.

  3. Model operations. Implement and operate the ML model, then monitor its performance and accuracy. 

This blog will review the core processes that various roles address in each phase, then define the resulting governance requirements and how to address them. It also cites vendor tools that can help along the way.

The Machine Learning Lifecycle: Phases and Governance Requirements

1: Data and feature engineering

Core process. The data scientist partners with the data engineer to ingest and transform the input data that influenced historical outcomes. For example, they might extract historical sales records from a customer relationship management (CRM) application, clickstreams from their e-commerce site, and product rankings from user review sites. They transform the data, for example by re-formatting columns and joining tables. From this dataset, they derive the features—i.e., the critical, refined input data—that help predict customer outcomes. If they are building a supervised learning model, the data scientist also works with business owners to identify and label historical customer outcomes, for example by assigning labels to their satisfaction level, purchase frequency, or follow-on purchases. Tools such as Tecton provide feature stores to assist with feature engineering, and tools such as Snorkel help programmatically label high volumes of unstructured data.
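As a sketch of the steps above, the snippet below derives two features from illustrative CRM-style records and applies a hypothetical business rule to label historical outcomes. The record layout, feature names, and 180-day churn rule are assumptions for illustration; a feature store or labeling tool would industrialize this logic.

```python
# A hypothetical sketch of feature derivation and outcome labeling,
# using plain Python in place of a real CRM extract or feature store.

from datetime import date

sales = [  # raw CRM-style records (illustrative)
    {"customer": "A", "order_date": date(2021, 1, 5), "amount": 120.0},
    {"customer": "A", "order_date": date(2021, 3, 2), "amount": 80.0},
    {"customer": "B", "order_date": date(2020, 6, 9), "amount": 40.0},
]

def derive_features(records, as_of):
    """Derive per-customer features: order count and days since last order."""
    features = {}
    for r in records:
        f = features.setdefault(r["customer"], {"orders": 0, "last": None})
        f["orders"] += 1
        if f["last"] is None or r["order_date"] > f["last"]:
            f["last"] = r["order_date"]
    for f in features.values():
        f["days_since_last"] = (as_of - f.pop("last")).days
    return features

def label_outcome(feature_row, churn_days=180):
    """Business rule (illustrative): no order in 180 days counts as churned."""
    return "churned" if feature_row["days_since_last"] > churn_days else "active"

feats = derive_features(sales, as_of=date(2021, 6, 1))
labels = {c: label_outcome(f) for c, f in feats.items()}
print(labels)  # {'A': 'active', 'B': 'churned'}
```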

Governance. Two governance requirements arise during this phase: data quality and data privacy. Enterprises’ ability to meet these requirements shapes the success of all that follows.

Data science teams must ensure data quality and data privacy as they assemble data and define features.

  • Data quality. The data scientist and data engineer must ensure the data they transform and the features they derive have accurate, consistent, and current values. They can control data quality by applying methods used for traditional analytics. This includes creating an inventory of data, cleansing the data, and validating samples with established quality rules. Data engineers should consult with data stewards and other governance officers to apply the right methods and follow them up with periodic audits. This helps ensure the data scientist has an accurate view of the business on which they can build a model. Data quality tools from vendors such as Informatica, Talend, and Ataccama can assist these steps. Tools such as HVR and Qlik also can check data quality during the ingestion and transformation process.

  • Privacy. The data scientist and data engineer should work with internal legal and compliance officers to ensure the privacy of personally identifiable information (PII). The General Data Protection Regulation (GDPR), California Consumer Privacy Act (CCPA), and other regulations set stringent guidelines about how companies can handle PII such as social security numbers. As with data quality, they can reduce risks by applying methods already used for traditional analytics. This means they should apply the same security policies and procedures as the rest of the business. Data platforms such as Databricks, Azure Synapse Analytics, and Snowflake assist these steps with features such as role-based access controls, data masking, and data encryption.
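Rule-based validation, the kind of check the data quality tools above automate, can be sketched in a few lines. The rules and records here are illustrative, not a real rule set.

```python
# A minimal sketch of rule-based data quality validation, in the spirit of
# the checks commercial tools automate. Rules and records are illustrative.

RULES = {
    "email": lambda v: isinstance(v, str) and "@" in v,
    "amount": lambda v: isinstance(v, (int, float)) and v >= 0,
}

def validate(records):
    """Return (clean_rows, violations) after applying each quality rule."""
    clean, violations = [], []
    for i, row in enumerate(records):
        failed = [col for col, rule in RULES.items() if not rule(row.get(col))]
        if failed:
            violations.append((i, failed))  # row index plus the rules it broke
        else:
            clean.append(row)
    return clean, violations

rows = [
    {"email": "a@example.com", "amount": 120.0},
    {"email": "not-an-email", "amount": -5},
]
clean, bad = validate(rows)
print(len(clean), bad)  # 1 [(1, ['email', 'amount'])]
```

Periodic audits would then review the violation log with data stewards rather than silently dropping records.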
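Masking and pseudonymization, which the data platforms above typically enforce through policies rather than application code, look roughly like this. The masking format and salt are illustrative.

```python
# A hypothetical sketch of column-level PII controls: masking shows only a
# fragment of a value, while pseudonymization replaces a direct identifier
# with a stable, non-reversible token. Format and salt are illustrative.

import hashlib

def mask_ssn(ssn: str) -> str:
    """Show only the last four digits, as masking policies commonly do."""
    return "***-**-" + ssn[-4:]

def pseudonymize(value: str, salt: str = "demo-salt") -> str:
    """Replace an identifier with a deterministic hash-based token."""
    return hashlib.sha256((salt + value).encode()).hexdigest()[:12]

record = {"name": "Jane Doe", "ssn": "123-45-6789"}
safe = {"name": pseudonymize(record["name"]), "ssn": mask_ssn(record["ssn"])}
print(safe["ssn"])  # ***-**-6789
```

Because the token is deterministic, analysts can still join records on the pseudonymized column without seeing the underlying identity.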

2: Model Development

Core process. The data scientist selects an ML technique, such as regression or classification in the case of supervised learning, then develops or (more likely) downloads the corresponding algorithm from an open-source library such as PyTorch, Scikit-Learn, or TensorFlow. They “train” that algorithm by applying it to historical features and comparing its resulting scores to actual historical outcomes. Did the algorithm’s score accurately classify customer satisfaction levels based on their purchase histories and product rankings? Did it accurately predict whether they would purchase a related product, based on prior purchases and clickstream patterns?

The training process is iterative. The data scientist applies the algorithm to historical features many times as they adjust the historical time frame, features, or parameters such as feature weightings. They might change or combine ML techniques. When the scores become sufficiently accurate, they have a final ML model, ready for production. Many tools assist the model development and training process. These include data science platforms such as DataRobot, Dataiku, and Domino Data Labs, and data warehouses such as Snowflake and Vertica—each of which integrates with the ecosystem of ML libraries, tools, and programming languages.
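The iterative loop described above can be sketched without any ML library: try candidate parameter settings, score each against known historical outcomes, and keep the most accurate. Here a single churn-by-inactivity threshold stands in for a real algorithm's many parameters; the data is illustrative.

```python
# A minimal sketch of the iterative training loop: try parameter settings,
# score against known historical outcomes, keep the most accurate model.
# The data and the single "threshold" parameter are illustrative.

history = [  # (days_since_last_order, actually_churned)
    (30, False), (45, False), (200, True), (400, True), (90, False), (250, True),
]

def accuracy(threshold):
    """Fraction of historical outcomes the rule 'days > threshold' gets right."""
    hits = sum((days > threshold) == churned for days, churned in history)
    return hits / len(history)

# Iterate over candidate parameters, as a data scientist iterates over
# time frames, features, and weightings.
best = max(range(0, 400, 10), key=accuracy)
print(best, accuracy(best))  # the first threshold that classifies all outcomes correctly
```

A real training run would also hold out a validation set so the chosen parameters generalize beyond the historical sample.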

Governance. Two additional governance requirements arise during the model development phase. Data scientists must build models that are unbiased and explainable. As in the prior phase, these requirements have implications for everything that follows.

Data scientists must build models that are unbiased and explainable.

  • Bias. ML models should not show bias by generating different scores for groups of people—e.g., different races or ethnicities—that otherwise have the same characteristics. Customers and the public expect companies to avoid these types of bias, and proposed AI regulations from the European Union might make it mandatory. Data scientists must engage their legal counsel and governance officers to understand their requirements, then inspect algorithms during the training phase to prevent, identify, and/or eliminate sources of bias. Their training data, features, labels, and algorithms should exclude data points related to race and ethnicity. Multiple stakeholders, including data scientists, business owners, and legal and compliance officers, should review the algorithm’s scores before deciding the algorithm is an unbiased model, ready for production. 

Data scientists can use the platforms listed above, or model management and governance platforms such as Datatron, to identify bias by applying algorithms to different training datasets and comparing results. They also can use open-source toolkits such as AI Fairness 360 and FairML to root out bias in features and model outputs.

  • Explainability. Data scientists and business owners must explain to legal and compliance officers what ML models do. They must explain why they defined features, why they assigned labels to historical outcomes, and how the model arrives at a score that relates features to labels. They also must explain the model’s risk, potential errors, and level of uncertainty. All this can be difficult because, by design, ML models identify patterns in data that humans might not identify themselves. 

Data scientists can make models understandable and therefore explainable in a few ways. They can make sure their training data matches the characteristics of production data—meaning that they train their algorithm on real-world scenarios. The real world is easier to explain than an artificial world. Data scientists also can select ML techniques such as regressions and random forests, based on decision trees, that have explainable logic. The tradeoff: more explainable techniques such as these might generate less accurate scores. And less explainable techniques—such as neural networks that combine multiple algorithms—might generate more accurate scores. When data scientists explain a model to business owners and other stakeholders, they should take the perspective of the customer. For example, what prompted the model to recommend that a customer purchase a given product?

Open-source tools such as LIME and SHAP, and data science platforms such as DataRobot and Dataiku, help answer questions like these; for example, by measuring the contribution of different features to the model’s score.
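One widely used check from the bias discussion above, comparing a model's positive-outcome rates across groups (demographic parity), can be sketched directly. The decisions, group names, and the "four-fifths" cutoff used as a rule of thumb are illustrative; toolkits such as AI Fairness 360 implement this and many more refined metrics.

```python
# A hypothetical sketch of a demographic parity check: compare approval
# rates across groups and flag large gaps. Data and cutoff are illustrative.

def approval_rate(decisions, group):
    """Share of positive outcomes within one group."""
    rows = [d for d in decisions if d["group"] == group]
    return sum(d["approved"] for d in rows) / len(rows)

decisions = [  # model outputs joined back to a protected attribute
    {"group": "A", "approved": True}, {"group": "A", "approved": True},
    {"group": "A", "approved": False}, {"group": "B", "approved": True},
    {"group": "B", "approved": False}, {"group": "B", "approved": False},
]

rate_a, rate_b = approval_rate(decisions, "A"), approval_rate(decisions, "B")
# The "four-fifths" rule of thumb flags a ratio below 0.8 as potential disparate impact.
ratio = min(rate_a, rate_b) / max(rate_a, rate_b)
print(round(ratio, 2), "flag" if ratio < 0.8 else "ok")  # 0.5 flag
```

Note the check needs the protected attribute for auditing even though, as the text says, the training features themselves should exclude it.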
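The feature-contribution idea behind tools like LIME and SHAP can be illustrated with a crude permutation test: scramble one feature's values and measure how much accuracy drops. The toy model and data are assumptions; real tools compute far more principled attributions.

```python
# A minimal sketch of permutation importance: shuffle one feature (here,
# deterministically reversed) and measure the drop in accuracy. The toy
# model deliberately depends only on the first feature.

def model(row):
    """Toy classifier: only the first feature actually matters."""
    return row[0] > 0.5

data = [((0.9, 0.1), True), ((0.2, 0.8), False),
        ((0.7, 0.3), True), ((0.1, 0.9), False)]

def accuracy(rows):
    return sum(model(x) == y for x, y in rows) / len(rows)

def permutation_importance(feature_idx):
    """Drop in accuracy when one feature's values are scrambled."""
    values = [x[feature_idx] for x, _ in data][::-1]  # deterministic scramble
    shuffled = [(tuple(values[j] if i == feature_idx else v
                       for i, v in enumerate(x)), y)
                for j, (x, y) in enumerate(data)]
    return accuracy(data) - accuracy(shuffled)

# Scrambling the decisive feature destroys accuracy; the ignored one does nothing.
print(permutation_importance(0), permutation_importance(1))  # 1.0 0.0
```

That per-feature accuracy drop is exactly the kind of evidence a data scientist can show legal and compliance officers to explain what drives a score.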

3: Model Operations

Core process. Once the data scientist deems a trained ML model to be ready for production, they hand that model off to the ML engineer for operations. The ML engineer collaborates with the DevOps engineer to insert the model into production workflows, such as automated text messages to customers or website popup windows in our e-commerce example. The ML engineer also collaborates with the data engineer to integrate with production data pipelines, and with the ITOps engineer to provision the necessary infrastructure. Together they go live with the model, then monitor KPIs for performance and cost. Data science platforms assist these steps, as do catalogs from vendors such as Informatica that centralize metadata about the various model versions.

Governance. Model operations need more vigilance than any other phase of the lifecycle. The ML engineer must monitor the accuracy of model scoring, which will degrade over time as the business environment shifts and the data therefore “drifts.” When model accuracy slips below defined thresholds, the ML engineer must pull the model out of production and collaborate with the data scientist to re-train or revise it. The ML engineer and business owner also should consult with legal counsel and governance officers to ensure the production model still meets standards for privacy and bias.
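The monitoring loop just described reduces, in its simplest form, to a threshold check that triggers retraining. The 0.85 threshold and the weekly accuracy figures below are illustrative.

```python
# A minimal sketch of accuracy monitoring with a retraining trigger,
# the kind of check model operations teams automate. Numbers are illustrative.

ACCURACY_THRESHOLD = 0.85  # hypothetical SLA agreed with the business

def monitor(weekly_accuracy):
    """Return the first week the model breaches the threshold, or None."""
    for week, acc in enumerate(weekly_accuracy, start=1):
        if acc < ACCURACY_THRESHOLD:
            return week  # pull the model from production and retrain
    return None

# Accuracy degrades as production data drifts away from the training distribution.
observed = [0.91, 0.89, 0.88, 0.86, 0.83, 0.81]
print(monitor(observed))  # 5
```

In practice the trigger would open a ticket or page the ML engineer rather than return a number, but the threshold logic is the same.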

Vigilant parents watch their teenager get behind the wheel of a car with trepidation about the risks, but the confidence they did all they could to ensure a positive outcome. Vigilant data science teams that govern the ML lifecycle can put their model into production with the same confidence.

To learn more about this topic from Eckerson Group’s consultants and researchers, check out our recent webinar, “Can Enterprises Tame the AI/ML Dragon? How to Govern Analytics Models.”

Kevin Petrie

Kevin is the VP of Research at Eckerson Group, where he manages the research agenda and writes about topics such as data integration, data observability, machine learning, and cloud data...

More About Kevin Petrie