Data integration is the ingestion and transformation of data from source systems into target systems, using a workflow that is often referred to as a data pipeline.
Data integration means combining two or more pieces of information. Today’s data engineers create and manage data pipelines that connect data from sources, such as files or databases, to targets such as cloud data platforms. They send data through those pipelines to feed operations and analytics. Finance, sales, and marketing applications all consume data, as do analytics projects that range from business intelligence (BI) to machine learning (ML) and other types of advanced analytics.
Data integration has four components. The first, data ingestion, extracts batches or increments of data (as well as their schema and metadata) from a source, then loads that data to the target. The second, transformation, combines, formats, structures, and cleanses data; it might also import or create data models. The third is management: data engineers manage their environments by designing, developing, testing, and deploying the data pipelines that ingest and transform data, and they also monitor, tune, and reconfigure those pipelines. The fourth, control, refers to tasks such as provisioning, version control, workflow orchestration, lineage, and documentation.
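The ingestion and transformation components described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the sample rows, the `orders` table, the function names, and the use of an in-memory SQLite database as a stand-in for a cloud data platform. It also transforms data before loading it, which is only one possible ordering; many pipelines load raw data first and transform it inside the target.

```python
import sqlite3

# Hypothetical source rows; in practice these would come from a file or database.
SOURCE_ROWS = [
    {"id": 1, "customer": "  Alice ", "amount": "10.50"},
    {"id": 2, "customer": "Bob", "amount": None},  # missing amount
    {"id": 3, "customer": "carol", "amount": "7"},
]


def extract(rows, last_seen_id=0):
    """Ingestion: pull only the increment of rows newer than the watermark."""
    return [row for row in rows if row["id"] > last_seen_id]


def transform(rows):
    """Transformation: cleanse, format, and structure rows for the target."""
    cleaned = []
    for row in rows:
        if row["amount"] is None:
            continue  # cleanse: drop rows that have no amount
        cleaned.append(
            (row["id"], row["customer"].strip().title(), float(row["amount"]))
        )
    return cleaned


def load(conn, records):
    """Ingestion: write the transformed records to the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT OR REPLACE INTO orders VALUES (?, ?, ?)", records)
    conn.commit()


def run_pipeline(conn, rows, last_seen_id=0):
    """Run extract, transform, and load; return the new watermark."""
    increment = extract(rows, last_seen_id)
    load(conn, transform(increment))
    return max((row["id"] for row in increment), default=last_seen_id)


conn = sqlite3.connect(":memory:")  # stand-in for the real target system
watermark = run_pipeline(conn, SOURCE_ROWS)
loaded = conn.execute(
    "SELECT id, customer, amount FROM orders ORDER BY id"
).fetchall()
```

Calling `run_pipeline` again with `last_seen_id=watermark` extracts nothing new, which is the point of ingesting increments rather than full batches. Monitoring, orchestration, and version control (the management and control components) are deliberately left out of this sketch.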
Technology capabilities are part of what defines modern data integration, so it’s important to understand the key characteristics of a comprehensive platform. Those characteristics and capabilities should align with both technical and business needs.
Data integration initiatives always include more than just a set of tools. The technical architecture also includes data warehouses and data marts, data integration and data quality components, dictionaries, repositories, and many other technologies. More importantly, organizations should have a proper BI strategy that goes well beyond an architecture blueprint to include non-technical requirements, alignment with corporate strategy, organizational models, outcome-based priority settings, and a proper roadmap.