New Year Reflections and Predictions by the Eckerson Group Team

2018 was a busy year for the data analytics industry.

On the technology side, Hadoop imploded, the cloud soared, and data lakes sank into the trough of disillusionment. Companies built “data pipelines” to support multiple use cases rather than just ETL jobs. Data catalogs grew from a self-service novelty to a data management requirement. Data preparation evolved from stand-alone tools to embedded functionality in every data and analytics product. BI tools beefed up their data capabilities, and data platforms embraced analytics. Finally, AI got automated and went downstream in search of citizen data scientists.

On the business side, organizations chased a digital strategy and hit a data wall. As a result, they hired Chief Data Officers, who immediately got pinned down by GDPR and cybersecurity issues. Companies began modernizing their data architectures, embracing cloud-first, streaming-first, and mobile-first principles. Many hired teams of data scientists, and those that didn’t upskilled their data analysts to build analytic models. Many companies embedded analytics in customer-facing applications, and some started charging for it, turning data and analytics into a profit center.

The New Year

2019 shows no signs of slowing down. Eckerson Group assembled its big minds for some deep thought about the coming year. We concluded that the rate of innovation will continue to accelerate, and a few older technologies will make a cameo appearance. Here are some of the key trends and ideas that will dominate 2019. 

1. Data Warehousing Becomes Sexy Again

It seemed that Hadoop and data lakes had dealt a death blow to data warehouses, which were relentlessly pummeled by the big data cognoscenti for almost a decade. But surprise, surprise: data warehousing is back! It turns out that having a centralized store for clean, integrated, historical data isn’t such a bad idea. The cloud is a big reason for the resurgence of the data warehouse. New cloud data warehouses from Snowflake and the cloud platform providers (Amazon, Microsoft, and Google) have figured out how to bring the best of big data (a built-in staging area, decoupled processing and storage, JSON support, elastic scalability, and open APIs) to the once-stodgy world of data warehousing. It didn’t hurt that Hadoop imploded and data lakes became data swamps. It once again became safe to utter “data” and “governance” in the same breath. This time, hopefully, people will realize that a data warehouse is just one pipeline among many and won’t try to shoehorn all data analytics activity (i.e., self-service, data science, data lakes) into it. (Wayne Eckerson)

2. Self-Driving Data Becomes a Reality

We hear a lot these days about self-driving cars and can be sure that in time autonomous vehicles will disrupt and reshape transportation. Today many forward-thinking data management and analytics vendors are working toward a vision of self-driving data. Similar to a self-driving vehicle that knows its location, destination, and routing, self-driving data can know its location and destinations, navigate data pipelines, and deliver itself wherever it is needed. Data management vendors who are introducing AI and machine learning into their products are moving rapidly along the path to making self-driving data a reality. (Dave Wells)

3. The Role of CIO Starts to Disappear

As the world of information management evolves, the role of CIO begins to lose significance. CIOs have traditionally been strategically responsible for data, information, and technology management. That worked in a simpler time when the scope of each was limited. Today’s world is more difficult, with the added complexities of big data, self-service analytics, data science, open source, cloud, and much more. Data strategy is quickly becoming the responsibility of the Chief Data Officer (CDO). Information strategy increasingly falls in the domain of a Chief Analytics Officer (CAO). Technology strategy becomes the responsibility of a Chief Technology Officer (CTO). Data and information are recognized as strategic assets. The expanding scope of data and information management quickly becomes too large to assign to a single role. As new CxO roles continue to emerge, the all-encompassing CIO role will become obsolete. (Dave Wells)

4. Multi-Cloud Environments Go Mainstream

Enterprises move to the next logical phase of cloud adoption: dual- or even tri-cloud platforms. Changing workloads, changing business requirements, and a steep learning curve prompt data teams to continuously re-evaluate their choice of Cloud Service Providers (CSPs) and strike a balance among multiple parties. This emerging strategy gives negotiating leverage to cloud customers as they shift data and workloads, improving the medium- and long-term cost savings of cloud investments. Enterprises will progressively improve the performance of production and analytics applications by assigning workloads to carefully selected processing platforms. Multi-cloud adoption also drives up the need for cloud experts within the enterprise, which dampens some of the cost benefits. (Kevin Petrie)

5. DataOps Goes Mainstream

Most organizations have identified data as an important asset to their business. However, for most organizations, the process of extracting, cleansing, transforming, modeling, formatting, and moving data is byzantine. There are too many manual steps, too much rework, and too many errors and inconsistencies. Business users get frustrated by the slow pace of delivery, and data people get burned out from trying to tame their ticket queues. DataOps applies the concepts and principles of DevOps to the data space. It insists on agile development methods, a testing-first approach, and a more systematic and automated way to deploy new functionality and code changes. The result is a more agile, accurate, and automated data development process that accelerates delivery and minimizes waste and rework. (Julian Ereth and Wayne Eckerson)
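To make the testing-first idea concrete, here is a minimal sketch of an automated data check that a pipeline could run before publishing a dataset. It is an illustration only: the `check_orders` function, the `order_id` and `amount` fields, and the rules themselves are hypothetical and not taken from any particular DataOps product.

```python
from typing import Dict, List


def check_orders(rows: List[Dict[str, object]]) -> List[str]:
    """Return a list of human-readable failures; an empty list means the data passed.

    Hypothetical rules: every row needs a unique order_id and a non-negative numeric amount.
    """
    failures: List[str] = []
    seen_ids = set()
    for index, row in enumerate(rows):
        order_id = row.get("order_id")
        amount = row.get("amount")
        if order_id is None:
            failures.append(f"row {index}: missing order_id")
        elif order_id in seen_ids:
            failures.append(f"row {index}: duplicate order_id {order_id}")
        else:
            seen_ids.add(order_id)
        if not isinstance(amount, (int, float)) or amount < 0:
            failures.append(f"row {index}: invalid amount {amount!r}")
    return failures


if __name__ == "__main__":
    sample = [
        {"order_id": 1, "amount": 10.0},
        {"order_id": 1, "amount": 5.0},      # duplicate key
        {"order_id": None, "amount": -3},    # missing key and negative amount
    ]
    for failure in check_orders(sample):
        print(failure)
```

In a DataOps workflow, a check like this would typically run automatically on every change, and a non-empty failure list would block the deployment instead of surfacing later as a support ticket.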

6. A Majority of Companies Will Move from Data Science Artisanship to Data Science Automation

In the data science (DS) maturity model that runs from experimentation to artisanship to automation to operationalization, we will see many companies move from artisanship to automation. There are four reasons for this: 1. Tools are getting better, with automated machine learning and model management. 2. The DS honeymoon is over, and many companies that have invested will want to see repeatable and measurable value. 3. There are growing numbers of data scientists and citizen data scientists to meet the demand. 4. Parental discipline in Hadoop-inspired data lakes is generally improving, which makes data safer and more useful for building AI and machine learning models. (Steve Smith)

7. A Streaming-First Architecture Becomes Standard

Timeliness has always been an essential factor for data analytics and will become even more important in the future, as organizations embrace use cases beyond strategic decision support, such as real-time and autonomous analytics. Traditional batch-oriented architectures are not designed to handle data streams or to extract, process, and present data in real time. In contrast, streaming architectures support both batch and streaming data. A streaming-first architecture can be gated to collect events in a queue and then process them as a batch at periodic intervals. This is why we will see a shift in architectural patterns toward streaming-first approaches, which will increase the flexibility and pace of analytics and enable new use cases. (Julian Ereth)
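The gating idea can be sketched in a few lines of Python. This is a simplified illustration, not a production design: the `MicroBatchGate` class, its parameters, and the event shape are hypothetical, and timestamps are passed in explicitly so the behavior is easy to follow.

```python
from collections import deque
from typing import Callable, Deque, List, Optional


class MicroBatchGate:
    """Queue incoming events and hand them to a handler in periodic batches."""

    def __init__(self, interval_seconds: float, handler: Callable[[List[dict]], None]):
        self.interval_seconds = interval_seconds
        self.handler = handler
        self._queue: Deque[dict] = deque()
        self._last_flush: Optional[float] = None

    def on_event(self, event: dict, now: float) -> None:
        """Accept one event; flush the queue once the interval has elapsed."""
        if self._last_flush is None:
            self._last_flush = now  # the clock starts with the first event
        self._queue.append(event)
        if now - self._last_flush >= self.interval_seconds:
            self.flush(now)

    def flush(self, now: float) -> None:
        """Send everything queued so far to the handler as a single batch."""
        if self._queue:
            batch = list(self._queue)
            self._queue.clear()
            self.handler(batch)
        self._last_flush = now


def run_demo() -> List[List[int]]:
    """Feed five timestamped events through a 10-second gate and collect the batches."""
    batches: List[List[int]] = []
    gate = MicroBatchGate(
        interval_seconds=10.0,
        handler=lambda batch: batches.append([event["id"] for event in batch]),
    )
    for event_id, timestamp in [(1, 0.0), (2, 4.0), (3, 10.0), (4, 12.0), (5, 25.0)]:
        gate.on_event({"id": event_id}, now=timestamp)
    gate.flush(now=25.0)  # drain anything still queued at shutdown
    return batches


if __name__ == "__main__":
    print(run_demo())
```

Setting `interval_seconds` to zero makes the same code hand over every event as soon as it arrives, which is the point of the streaming-first argument: batch becomes a configuration of the stream, not a separate architecture.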

8. Companies Require Ethics Reviews for AI Algorithms

Some industries, such as the financial services industry, already regulate analytic models that approve customers for credit cards, loans, and other products. But more regulation is coming. Besides ensuring accuracy and transparency, new regulations will require organizations to evaluate and monitor analytic models for innate bias and discriminatory behavior. It’s been demonstrated that facial recognition software can fail to identify people of color when developers, who are mostly young, white males, use themselves and their peers as training data to feed the models. Companies will need to establish review boards to ensure overt or inadvertent bias does not creep into models, especially those used to interact with the public.
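As one example of what monitoring a model for bias might involve, the sketch below compares approval rates across groups in a model’s decisions. It is a deliberately simple illustration: the function names, the group labels, and the 0.8 review threshold are assumptions made for this example, not a regulatory standard or a complete fairness audit.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def approval_rates(decisions: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Compute the share of approved decisions for each group."""
    totals: Dict[str, int] = defaultdict(int)
    approved: Dict[str, int] = defaultdict(int)
    for group, was_approved in decisions:
        totals[group] += 1
        if was_approved:
            approved[group] += 1
    return {group: approved[group] / totals[group] for group in totals}


def disparity_ratio(rates: Dict[str, float]) -> float:
    """Lowest group approval rate divided by the highest (1.0 means parity)."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # nobody was approved, so there is no disparity to measure
    return min(rates.values()) / highest


if __name__ == "__main__":
    # Hypothetical model output: (group label, approved?)
    sample = (
        [("group_a", True)] * 4 + [("group_a", False)] * 1
        + [("group_b", True)] * 2 + [("group_b", False)] * 3
    )
    rates = approval_rates(sample)
    ratio = disparity_ratio(rates)
    REVIEW_THRESHOLD = 0.8  # illustrative cutoff chosen for this example
    print(rates, ratio, "needs review" if ratio < REVIEW_THRESHOLD else "ok")
```

A single ratio like this will not prove or disprove discrimination, but tracking it over time gives a review board a concrete signal to investigate.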

9. Analytics and Application Architectures Converge

Increasingly, organizations want to embed analytic output into other applications. This includes visualizations, dashboards, self-service analytic environments, and analytic models, such as recommendation engines. In addition, most development teams are embracing DevOps and microservices while moving to a Software-as-a-Service model in which applications leverage a host of cloud-based services via Java and REST APIs. 

Unfortunately, most BI and analytic products are designed as monolithic applications that run on a separate application server (or servers) in an on-premises data center. They are not lightweight, easily componentized, or accessible in part or whole via APIs. To remain relevant to application developers, BI and analytics vendors will need to deconstruct their products into lightweight services that run inside containers. This will enable developers to mix and match operational and analytical functionality in modern, scale-on-demand, cloud-based applications. (Wayne Eckerson)

10. GDPR both Hurts and Helps Machine Learning

Since the General Data Protection Regulation (GDPR) went into full effect on May 25th, 2018, companies have been unsure how to limit the use of data for building machine learning and AI predictive models. A recent report from Talend showed that while 98% of companies surveyed had updated their privacy policies, only 30% actually responded to a mandatory personal data request. These results show that the lawyers have done their job, but the data processors are still struggling. Expect this low compliance and struggle to continue for a few more years. After that, though, GDPR should make it easier for data scientists to find and use the right data in a privacy-safe way to build their models. (Steve Smith)

We trust you enjoyed our predictions. Please tell us yours! We hope you had a tremendous 2018 and will have an even better 2019!

Wayne Eckerson

Wayne Eckerson is an internationally recognized thought leader in the business intelligence and analytics field. He is a sought-after consultant and noted speaker who thinks critically, writes clearly and presents...
