The Future of Data Warehousing: Integrating with Data Lakes, Cloud and Self-Service
Recent developments in data management (self-service, big data, data lakes, NoSQL, Hadoop, and the cloud) raise questions about the role of the data warehouse in modern analytic ecosystems. Though pundits have declared the data warehouse dead, most organizations continue to operate at least one data warehouse, with the majority operating two to five, and expect to do so for the foreseeable future. Data warehousing is alive, but perhaps not alive and well.
Legacy data warehouses must modernize to fit gracefully into modern analytics ecosystems. They play an important role in data management as an archive of enterprise history and a source of carefully curated and highly integrated data for a broad scope of line-of-business information needs. To continue filling that role well, they must evolve both architecturally and technologically. Yet in many instances, data warehouse evolution is stalled due to uncertainty about what, how, and when to change.
This report provides guidance to break the logjam and begin moving to data warehouses that are agile, scalable, and adaptable in the face of continuous change. It describes how patterns of architectural restructuring, cloud migration, virtualization, and more can be used to combine data warehouses with big data, cloud, NoSQL, and other recent technologies to resolve many of today’s data warehousing challenges and to prepare for the future of data warehousing.