Two Approaches for Creating an Integrated Data Analytics Environment

As data analytics workflows and ecosystems become increasingly complex, organizations and software vendors alike recognize the need for integrated data analytics environments. It is challenging and time-consuming to hop from tool to tool when building data analytics solutions and monitoring the flow of data to support those solutions.

An integrated data analytics environment makes it easier to create and automate data pipelines and workflows to deliver information to business users. It reduces data silos, enhances security and governance, and eliminates system redundancies. It can also improve process efficiencies, reducing costs and accelerating time to insight.

There are two primary ways to create an integrated environment:

1) Purchase an all-in-one platform from a single vendor that contains all the functionality required to build, deploy, and monitor data analytics workflows.

2) Purchase an orchestration platform that coordinates the inputs and outputs of myriad stand-alone tools, systems, and applications.

Both approaches aim to deliver the same result: an integrated environment that speeds the delivery of robust, high-quality data analytics solutions to business users. Each, however, has distinct pros and cons. All-in-one platforms simplify the data environment, but at the cost of vendor dependency. Orchestration tools allow for best-of-breed solutions, but don’t reduce the underlying complexity of the software landscape. The real question is how to select the right approach for your organization or use case.

Picking the Right Approach

The key to selecting the right approach is to know your requirements. Consider the following:

  • Does your organization already have a range of products?
  • How attached are your teams to those tools?
  • How easy is it to migrate existing jobs to a new platform?
  • What is the cost of migrating those jobs in time, labor, and software licenses?
  • Is this an enterprise or departmental solution?

Thinking about these elements will put you in a better position to choose the right integration strategy.

UDAPs. Last year my colleague Wayne Eckerson published a report titled "The Rise of Unified Data and Analytics Platforms: The Case for Convergence." That report was downloaded more than any other report on the Eckerson Group site this year. By consolidating the entire data and analytics stack into a single platform, unified data and analytics platforms (UDAPs) simplify and accelerate the development of data pipelines and analytics solutions while reducing costs. Start-ups such as Domo and Incorta have built these platforms from scratch, while established software companies such as SAP, Microsoft, and IBM have assembled integrated platforms from existing tools.

UDAPs contain all the components required to deliver a data analytics solution, including data ingestion and integration software, a data refinery, data analytics tools, and administrative and management tools. UDAPs create a one-stop shop for building and consuming data analytics solutions. This consolidation simplifies delivery, administration, and consumption, reducing costs and accelerating time to insight. (See figure 1.)

Figure 1. All-in-One UDAP

The drawback to UDAPs is that they tend to be jacks of all trades and masters of none. Although they provide a comprehensive feature set, they may not support every function an organization needs or wants, and the functions they do support may not be as rich as those offered by best-of-breed tools. UDAPs are best suited to small or medium-sized organizations that are new to data analytics, or to departments at large companies without much legacy software.


Orchestration tools. Just as an artist uses a paint palette to access and blend colors, an organization uses an orchestration tool to access and blend multiple tools, systems, and applications into a coherent workflow. These products facilitate a best-of-breed approach, since they link together whatever tools an organization already has in place. (See figure 2.)

Figure 2. Integration-Oriented Orchestration Tool

Orchestration tools are best suited to companies with lots of existing data and analytics tools that they want to keep using. This means most large companies with existing enterprise data and analytics departments are good candidates for orchestration tools. These tools let organizations have their cake and eat it too – best-of-breed tools and highly coordinated workflows.


This flexibility comes at a price, however. Orchestration tools do not simplify the actual data environment; they just make it easier to integrate. Since orchestration tools generally rely on other applications to directly handle the data, organizations typically can’t retire any of their existing subscriptions. A given orchestration tool may also be limited in which tools it can integrate. Although vendors try to make their products as extensible as possible, it is vital to ensure the product actually works with the tools your organization uses.
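To make the coordination idea concrete, here is a minimal sketch in Python of what an orchestration layer does at its core: it runs steps in dependency order and hands each step the outputs of the steps before it, while the steps themselves do the actual data handling. Everything named here is hypothetical. The functions `ingest`, `clean`, and `report` are stand-ins for separate stand-alone tools, and `WORKFLOW` and `run_workflow` are illustrative names, not the API of any particular product.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


# Each function below is a hypothetical stand-in for a separate tool.
def ingest(inputs):
    # Pretend an ingestion tool pulled these rows from a source system.
    return [{"region": "east", "sales": 120}, {"region": "west", "sales": None}]


def clean(inputs):
    # Pretend a data quality tool dropped rows with missing values.
    return [row for row in inputs["ingest"] if row["sales"] is not None]


def report(inputs):
    # Pretend an analytics tool totaled the cleaned rows.
    return sum(row["sales"] for row in inputs["clean"])


# Assumed layout: step name -> (callable, names of upstream steps).
WORKFLOW = {
    "ingest": (ingest, []),
    "clean": (clean, ["ingest"]),
    "report": (report, ["clean"]),
}


def run_workflow(workflow):
    """Run every step after its upstream steps, passing their outputs along."""
    graph = {name: set(deps) for name, (_, deps) in workflow.items()}
    results = {}
    for name in TopologicalSorter(graph).static_order():
        func, deps = workflow[name]
        # The orchestrator only routes inputs and outputs between steps.
        results[name] = func({dep: results[dep] for dep in deps})
    return results


if __name__ == "__main__":
    print(run_workflow(WORKFLOW)["report"])
```

Note that the sketch retires none of the three "tools"; it only sequences them, which mirrors the trade-off described above.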

Spectrum. In reality, most integration products fall on a spectrum between all-in-one and best-of-breed approaches. Most UDAPs are extensible, so customers can plug in select third-party products that augment or replace built-in functionality. Conversely, some orchestration tools offer data processing or analytics functionality, such as ETL or data quality features, so that customers do not have to purchase or use stand-alone tools in every instance. Ultimately, vendors listen to their customers and add the functionality they request. Over time, this tendency makes products more similar than different.

Recommendation

In a complex data landscape, there are two main paths to integration. For companies or departments without strong dependencies on legacy tools, UDAPs offer a straightforward way to centralize the entire data analytics workflow into a single, all-in-one platform. For organizations pursuing a best-of-breed approach, orchestration tools offer another path. Their ability to manage a plethora of different tools allows greater flexibility in curating the exact feature set that a company desires.

Whether you end up purchasing a UDAP or an orchestration tool, implementation works best when it is done piecemeal. Most vendors recommend picking one or two bottlenecks to start with and transitioning those steps in the workflow to your UDAP or orchestration tool. Then, over time, you can integrate the rest of the workflow into the new platform until it manages the entire process. This technique will allow your users to acclimate to the new system gradually and reduce the risk of breaking production during deployment.

Joe Hilleary

Joe Hilleary is a writer, researcher, and data enthusiast. He believes that we are living through a pivotal moment in the evolution of data technology and is dedicated to...
