A Guide to Maximum Data Lake Value
Building and operating a data lake is a big investment with uncertain ROI. The processes that put data into a data lake don’t create value; they only create databases. Getting to value requires a plan that embraces adaptability, agility, trustworthy data, and smart automation, all enabled by the right technologies.
A data lake delivers value at three levels. At the data management level, the data lake must support all types of data regardless of form and format, must be optimized for sharing and reuse of data, and must be built on an adaptable architecture. At the usage level, the data lake must support all use cases, from self-service analytics to advanced analytics and data science. At the business impact level, a valuable data lake extends beyond decision support to become a driver of innovation and digital transformation.
The challenge lies in building and operating a data lake that satisfies the value prerequisites at all three levels. Getting there requires several critical features and capabilities. Change data capture (CDC) ensures timely capture of data. Data pipeline automation ensures timely delivery of data and provides the agility needed to adapt to changing data and changing business needs. A data marketplace built on smart data catalog technology provides the features needed to find, evaluate, access, and use data. Ingesting data into a data lake creates the opportunity for value. Turning that opportunity into reality depends on the capabilities to move data efficiently through pipelines, adapt to continuous change, and use data to drive business impact.
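To make the idea of change data capture concrete, the sketch below shows one simple form of it: comparing two snapshots of a source table and emitting insert, update, and delete events. This is an illustration only. It assumes each row has an integer primary key, the function name `capture_changes` and the sample customer rows are hypothetical, and production CDC tools more commonly read the database transaction log rather than comparing snapshots.

```python
from typing import Any

Row = dict[str, Any]


def capture_changes(
    previous: dict[int, Row], current: dict[int, Row]
) -> list[dict[str, Any]]:
    """Compare two snapshots keyed by primary key and return change events.

    Hypothetical helper: snapshot comparison, not log-based CDC.
    """
    events: list[dict[str, Any]] = []

    # Rows present now: either new (insert) or changed (update).
    for key, row in current.items():
        if key not in previous:
            events.append({"op": "insert", "key": key, "row": row})
        elif previous[key] != row:
            events.append({"op": "update", "key": key, "row": row})

    # Rows that disappeared since the last snapshot (delete).
    for key in previous:
        if key not in current:
            events.append({"op": "delete", "key": key, "row": None})

    return events


# Illustrative data only: two snapshots of a made-up customer table.
previous = {
    1: {"name": "Ada", "tier": "gold"},
    2: {"name": "Lin", "tier": "silver"},
}
current = {
    1: {"name": "Ada", "tier": "platinum"},
    3: {"name": "Sam", "tier": "bronze"},
}

events = capture_changes(previous, current)
for event in events:
    print(event)
```

Only the changed rows travel downstream, which is what keeps capture timely: the pipeline does not need to reload the full table on every run.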