A Big Step Forward for Data Governance Automation

I strongly believe that there are two indisputable facts about data governance: (1) It is a core and critical part of modern data management. (2) Manual data governance is too slow, too rigid, and too costly; it must be automated.

On July 2nd, Informatica announced a big step forward in the move to data governance automation. Already a leader in the field, Informatica recognizes that governance depends on intelligent use of metadata, a principle put into practice with the combination of Axon Data Governance, Enterprise Data Catalog, and the underlying CLAIRE engine for AI/ML-powered automation. Now the company doubles down on the power of smart metadata with the acquisition of Compact Solutions and its premier product, MetaDex.

MetaDex Spotlight provides advanced metadata scanners and end-to-end data lineage capabilities including visibility into the dark and obscure corners of data flows—the scripts, custom code, legacy systems, mainframes, multi-vendor ETL, and BI applications that are typically invisible to modern data catalogs and data management technologies. This is big because understanding end-to-end lineage is central to data governance. To govern effectively we must know where data comes from, where it goes, and how it is manipulated and refined along the way. To automate governance, the knowledge of data lineage must be based on metadata that is managed in a data catalog.
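
To make the lineage idea concrete, here is a minimal sketch of how "where data comes from" and "where it goes" can be answered once lineage is stored as metadata. It is not how MetaDex or Enterprise Data Catalog is implemented; the asset names, the edge list, and the functions origins and impact are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict, deque

# Hypothetical lineage edges: (source, target) means data flows source -> target.
EDGES = [
    ("mainframe.customers", "etl.stage_customers"),
    ("etl.stage_customers", "warehouse.dim_customer"),
    ("crm.accounts", "warehouse.dim_customer"),
    ("warehouse.dim_customer", "bi.churn_dashboard"),
]


def build_graph(edges):
    """Index the edges in both directions for upstream and downstream lookups."""
    upstream = defaultdict(set)
    downstream = defaultdict(set)
    for source, target in edges:
        downstream[source].add(target)
        upstream[target].add(source)
    return upstream, downstream


def reachable(start, neighbors):
    """Breadth-first search: every asset reachable from start via neighbors."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


UPSTREAM, DOWNSTREAM = build_graph(EDGES)


def origins(asset):
    """Where the data comes from: all upstream assets."""
    return reachable(asset, UPSTREAM)


def impact(asset):
    """Where the data goes: all downstream assets affected by a change."""
    return reachable(asset, DOWNSTREAM)
```

Under these assumed edges, origins("bi.churn_dashboard") traces the dashboard back through the warehouse and staging layers to the mainframe and CRM sources, while impact("etl.stage_customers") lists what a change to the staging table would touch. The hard part in practice is populating the edges from scripts, legacy code, and multi-vendor ETL, which is the gap the scanners described above are meant to close.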

Governance automation works best with a single source of lineage metadata: everything you need to know in one data catalog. Yet the reality for most organizations today is multiple catalogs. You may have a primary catalog, but also several “bolt-on” catalogs such as Azure Data Catalog, AWS Glue, Tableau Catalog, etc. Supporting tool-specific catalogs while bringing all metadata together in a single catalog is the vision of Informatica’s Enterprise Data Catalog. Compact’s MetaDex Injector advances that vision with easy, code-free import of metadata from multiple and disparate catalogs.

Ultimately, data governance automation is built upon active use of catalog metadata. Analyzing data lineage, providing transparent views into data pipelines, discovering and tagging critical enterprise data including privacy-sensitive data, tracking use of data that is subject to regulatory compliance, and quickly analyzing the impact of changes to data and regulations are crucial data governance processes. When done manually, these processes are time-consuming, labor-intensive, error-prone, and expensive. Automating the processes with a combination of AI/ML and catalog metadata just makes sense. That is the path to agile data governance that produces high-quality results quickly and efficiently.
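
As a small illustration of one of these processes, discovering and tagging privacy-sensitive data, the sketch below tags column names using simple rules. It is a deliberately naive stand-in, not a description of CLAIRE or any Informatica feature: the patterns, labels, and the function tag_columns are assumptions made for this example, and a real system would presumably also inspect data values and use learned models rather than names alone.

```python
import re

# Hypothetical rules mapping a sensitivity label to a column-name pattern.
SENSITIVE_PATTERNS = {
    "email": re.compile(r"e[-_]?mail", re.IGNORECASE),
    "phone": re.compile(r"phone|mobile", re.IGNORECASE),
    # "ssn" as a whole token, so names that merely contain the letters are skipped.
    "ssn": re.compile(r"(?:^|[^a-z])ssn(?:$|[^a-z])|social[-_]?security", re.IGNORECASE),
}


def tag_columns(columns):
    """Return {column_name: [labels]} for columns that match at least one rule."""
    tags = {}
    for column in columns:
        matched = sorted(
            label
            for label, pattern in SENSITIVE_PATTERNS.items()
            if pattern.search(column)
        )
        if matched:
            tags[column] = matched
    return tags
```

Calling tag_columns(["customer_email", "SSN", "home_phone", "order_total"]) tags the first three columns and leaves order_total untagged. Even a rule set this crude shows why catalog metadata matters: once tags live in the catalog, compliance tracking and impact analysis can be run against them automatically instead of by hand.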

This is exciting stuff and a big step in the right direction. I advocated for data governance automation in a recent report titled Data Governance Automation: Making the Most of Data and Analytics Technologies. Little did I know at the time how well Informatica had blazed the trail. Lineage metadata is a natural fit for them because metadata-based data transformation is part of their heritage, reaching all the way back to the early releases of PowerCenter. More than two decades later, the vision and innovation continue with the powerful combination of CLAIRE, Enterprise Data Catalog, and MetaDex.

Dave Wells

Dave Wells is an advisory consultant, educator, and industry analyst dedicated to building meaningful connections throughout the path from data to business value. He works at the intersection of information...
