DevOps in Business Intelligence and Analytics

What is DevOps?

As the term DevOps itself implies, its goal is to bridge the gap between (software) development and (IT) operations. This is especially challenging because the goals of these two areas often conflict. The operations team is usually interested in providing stable and reliable services and infrastructure. It therefore avoids risks and tries to work as predictably as possible. The development team, in contrast, has to respond to rapidly changing business requirements and is interested in fast cycles in order to quickly deploy its changes to production systems (cf. Figure 1). DevOps tackles these challenges by establishing a culture and processes that tear down the silos.

Figure 1. DevOps tries to bridge the gap between development and operations

DevOps has its roots in the agile community and is often discussed in relation to agile software development and to approaches that automate the software build process, such as continuous delivery. However, it is important to realize that DevOps is a philosophy rather than a method, a framework, or a certain technology. Or, to be more concrete: “DevOps is the philosophy of unifying Development and Operations at the culture, system, practice, and tool levels, to achieve accelerated and more frequent delivery of value to the customer, by improving quality in order to increase velocity” [1]. The levels named in this definition illustrate the holistic character of DevOps, which spans behavior, trust, and responsibility (culture); defined processes, metrics, and governance (practices); and shared technologies and platforms for cooperation (tools).

Besides agile and holistic thinking, continuous delivery has emerged as one of the key concepts in DevOps. The bottom line of continuous delivery is that software should be built in such a way that it can be released to production at any time [2]. To achieve this, you need, on the one hand, a unifying culture between development and operations (DevOps in the broader sense) and, on the other hand, a streamlined deployment pipeline that defines processes and automates delivery.

A deployment pipeline is usually customized to the specific scenario; however, most pipelines comprise stages similar to the following [3,4]:

  1. Build Automation and Continuous Integration: This stage defines how new features are integrated into the source code. It usually comprises clear processes supported by version control as well as automated building and testing, i.e. automated tests that ensure that a system works at the technical level.
  2. Test Automation: This stage comprises comprehensive testing to ensure that a system works on a functional and nonfunctional level and that it meets user requirements. It involves both automated testing (e.g. acceptance test automation [5]) and manual testing. Moreover, there are usually certain metrics that ensure that no feature is deployed that hasn’t been extensively tested.
  3. Deployment Automation: The last stage involves an automated rollout of new features to production environments. The requirement in this stage is that each deployment has been tested and monitored in a staging environment (a one-to-one clone of the production environment) and can be rolled out to production without any manual intervention.

Each of these stages is supported by various tools and processes that help to automate as much of the deployment pipeline as possible. However, an even more important advantage is the rapid feedback cycle, which helps teams to realize quickly when something goes wrong. In summary, the overarching idea of DevOps and the need for a streamlined deployment pipeline go hand in hand.
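To make the idea of a staged pipeline with rapid feedback more tangible, the three stages listed above can be sketched in a few lines of Python. This is only an illustrative sketch, not a real CI/CD tool: the `Stage` class, the `run_pipeline` function, and the three placeholder stage functions are hypothetical names introduced here, and in practice each placeholder would call an actual build, test, or deployment system.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Stage:
    """One pipeline stage: a name plus a callable that reports success."""
    name: str
    run: Callable[[], bool]  # returns True when the stage succeeds


def run_pipeline(stages: List[Stage]) -> Tuple[bool, List[str]]:
    """Run the stages in order and stop at the first failure.

    Stopping early is what gives fast feedback: a broken build never
    reaches the test stage, and a failed test never reaches production.
    """
    log: List[str] = []
    for stage in stages:
        ok = stage.run()
        log.append(f"{stage.name}: {'passed' if ok else 'FAILED'}")
        if not ok:
            return False, log
    return True, log


# Hypothetical placeholders for real build, test, and deployment steps.
def build_and_integrate() -> bool:
    return True


def run_acceptance_tests() -> bool:
    return True


def deploy_to_production() -> bool:
    return True


PIPELINE = [
    Stage("build automation and continuous integration", build_and_integrate),
    Stage("test automation", run_acceptance_tests),
    Stage("deployment automation", deploy_to_production),
]

if __name__ == "__main__":
    success, log = run_pipeline(PIPELINE)
    print("\n".join(log))
```

The design choice worth noting is the early return: because a failing stage blocks all later stages, nothing reaches production without having passed every earlier stage.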

Benefits of DevOps

The benefits of establishing DevOps can be manifold.

Improve collaboration and communication: This is probably the most significant impact of establishing DevOps, as it constitutes the foundation for all other benefits below. Besides the formal improvement of communication and collaboration, DevOps can also help to better understand different perspectives and exchange knowledge, which reduces conflicts, results in faster resolutions of issues, and, last but not least, improves the working atmosphere and employee engagement.

Reduce time from idea to production: A streamlined deployment pipeline and clear development, testing, and rollout processes can significantly reduce the time from an idea to an actual feature in the production environment. Moreover, the idea of continuous integration and deployment promotes incremental changes and discourages deferring features to the next “big” release. This increases agility and makes it possible to react quickly to changing business requirements, which in turn leads to higher user acceptance.

Improve quality and reliability: Besides the reduction in time, a defined deployment pipeline also reduces complexity and ensures a certain level of quality. No feature reaches production without having passed through each stage and having been extensively tested. Automated tests alone already increase quality by checking for common errors after each change. Moreover, the high degree of automation reduces the risk of manual errors and unpredictable consequences.

DevOps applications in Business Intelligence and Analytics

Most of the time, DevOps is discussed in the context of software development. However, DevOps also holds potential in the business intelligence and analytics sector. This section briefly illustrates two use cases coming from data warehouse management and advanced analytics.

1. Data Warehouse Management

The data warehouse (DW) is a central repository for enterprise data and hence a key element of a business intelligence architecture. It usually extracts and stores data from various sources in order to transform and provide data for reporting and analytics. Consequently, DWs are usually rather sophisticated solutions and the management of a DW can be challenging. Moreover, changes in a DW are often very slow since they have to be approved by many stakeholders, and the subsequent deployment process is often very complicated and involves manual interventions.

Establishing DevOps in this context can reduce complexity and improve manageability by bringing all involved parties together. For instance, a data engineer can discuss planned features with the database administrator to prevent unwanted consequences at an early stage. At the same time, the database administrator can prepare the environment for upcoming changes. In addition, a streamlined deployment pipeline, as described above, can speed up changes in a DW by defining and automating processes. There are many approaches, mainly coming from data warehouse automation, to building efficient BI deployment pipelines that help to automate source analysis, testing, documentation, and more [6]. Lastly, automated testing (e.g. automated regression and workload tests in a staging environment) can be extremely helpful for dealing with complexity in a DW and avoiding unforeseen behavior.
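As a rough illustration of what an automated regression test in a staging environment might look like, the following Python sketch uses an in-memory SQLite database as a stand-in for a staging DW. All table names (`src_orders`, `dw_revenue`), the sample rows, and the reconciliation checks are assumptions made for this example; a real DW would run comparable checks against its own schema and tooling.

```python
import sqlite3


def load_staging(conn):
    """Create a tiny, hypothetical source table in the staging database."""
    conn.execute(
        "CREATE TABLE src_orders (order_id INTEGER, customer TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO src_orders VALUES (?, ?, ?)",
        [(1, "acme", 100.0), (2, "acme", 50.0), (3, "globex", 75.0)],
    )


def transform(conn):
    """The transformation under test: aggregate revenue per customer."""
    conn.execute(
        "CREATE TABLE dw_revenue AS "
        "SELECT customer, SUM(amount) AS revenue, COUNT(*) AS orders "
        "FROM src_orders GROUP BY customer"
    )


def regression_checks(conn):
    """Return the failed checks; an empty list means the change may proceed."""
    failures = []
    src_total = conn.execute("SELECT SUM(amount) FROM src_orders").fetchone()[0]
    dw_total = conn.execute("SELECT SUM(revenue) FROM dw_revenue").fetchone()[0]
    if abs(src_total - dw_total) > 1e-9:
        failures.append("revenue totals do not reconcile")
    src_rows = conn.execute("SELECT COUNT(*) FROM src_orders").fetchone()[0]
    dw_rows = conn.execute("SELECT SUM(orders) FROM dw_revenue").fetchone()[0]
    if src_rows != dw_rows:
        failures.append("order counts do not reconcile")
    nulls = conn.execute(
        "SELECT COUNT(*) FROM dw_revenue WHERE customer IS NULL"
    ).fetchone()[0]
    if nulls:
        failures.append("NULL customer keys in target")
    return failures


def run_staging_test():
    """Load sample data, apply the transformation, and run all checks."""
    conn = sqlite3.connect(":memory:")
    try:
        load_staging(conn)
        transform(conn)
        return regression_checks(conn)
    finally:
        conn.close()


if __name__ == "__main__":
    print(run_staging_test() or "all checks passed")
```

Run after every change to a transformation, checks of this kind catch silently dropped or duplicated rows before the change reaches the production DW.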

2. DevOps and Advanced Analytics

Advanced analytics subsumes more sophisticated techniques for (semi-)automatically analyzing data that usually go beyond traditional BI methods (e.g. machine learning, text mining, and many more). Here, data scientists examine data sets and develop models and algorithms to gain deeper insights, make predictions, or generate recommendations. Interestingly, the process of building models and algorithms often happens quite separately from their later application. A common procedure is that data scientists build and train their models with selected test data sets and then deploy them to a production environment to see what happens. This often leads to good results for temporary or one-time analyses, but when models outgrow their temporary role and become a vital part of the business, this approach is often no longer sufficient. This is where DevOps thinking (which some call AnalyticsOps [7]) comes into play to establish a holistic perspective on analytics focused on quality and continuous improvement. In line with the methods discussed above, this encompasses defined roles, processes, quality standards, and a clear analytics deployment pipeline. Here, DevOps again assures quality and increases speed, but even more importantly, it can help to get advanced analytics out of the “magical data science” corner and establish well-defined standards and long-term solutions that secure ROI.
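One possible building block of such an analytics deployment pipeline is a quality gate that decides whether a newly trained model may replace the one in production. The Python sketch below is a simplified assumption of how such a gate could look: the `Model` type, the `accuracy` and `should_promote` functions, the 0.8 threshold, and the toy threshold "models" are all hypothetical and stand in for a real model and evaluation framework.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical model interface: maps one numeric feature to a class label.
Model = Callable[[float], int]
Dataset = Sequence[Tuple[float, int]]  # pairs of (feature, true label)


def accuracy(model: Model, data: Dataset) -> float:
    """Share of holdout examples the model classifies correctly."""
    correct = sum(1 for x, label in data if model(x) == label)
    return correct / len(data)


def should_promote(
    candidate: Model,
    current: Model,
    holdout: Dataset,
    min_accuracy: float = 0.8,  # assumed quality standard, not a recommendation
) -> bool:
    """Promote only if the candidate meets the quality standard
    and is at least as accurate as the model currently in production."""
    cand_acc = accuracy(candidate, holdout)
    return cand_acc >= min_accuracy and cand_acc >= accuracy(current, holdout)


# Toy data and models, purely for illustration.
HOLDOUT = [(0.1, 0), (0.4, 0), (0.55, 1), (0.6, 1), (0.9, 1)]


def current_model(x: float) -> int:
    return int(x > 0.7)


def candidate_model(x: float) -> int:
    return int(x > 0.5)


if __name__ == "__main__":
    print("promote candidate:", should_promote(candidate_model, current_model, HOLDOUT))
```

The point of the sketch is that promotion becomes an explicit, repeatable decision against a defined quality standard instead of deploying a model "to see what happens".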

Summary

This article has shown that DevOps is not a tool or a certain method, but rather a holistic philosophy of bringing development and operations together to improve quality, increase speed, and establish a culture of continuous improvement. Accordingly, the adoption of DevOps requires an integrated approach that involves cultural changes, defined processes and measurements, as well as the right tools and infrastructure.

As in many other sectors, DevOps holds great potential for business intelligence and analytics. The management of complex DW solutions in particular can benefit from a holistic approach and a defined deployment pipeline. However, there are many other applications, such as advanced analytics, where DevOps can help to establish standards and assure quality.

In summary, DevOps should not be reduced to its promotion of automation or its defined processes. Rather, it should be seen as an approach to creating an environment that is focused on efficiency, quality, interdisciplinarity, and continuous improvement.

Further reading

[1] (2017) England, Rob: Define DevOps. What is DevOps?
http://www.itskeptic.org/content/define-devops

[2] (2013) Fowler, Martin: Continuous Delivery
https://martinfowler.com/bliki/ContinuousDelivery.html#footnote-when

[3] (2010) Humble, Jez; Farley, David: Continuous Delivery: Reliable Software Releases through Build.
http://www.informit.com/articles/article.aspx?p=1621865

[4] (2014) The Continuous Delivery Pipeline — What it is and Why it’s so Important in Developing Software
https://devops.com/continuous-delivery-pipeline/

[5] (2010) Vingrys, Kristan: Acceptance test automation
https://www.thoughtworks.com/de/insights/blog/acceptance-test-automation

[6] (2015) Eckerson, Wayne W.: Which Data Warehouse Automation Tool is Right for You?
https://www.eckerson.com/register?content=which-data-warehouse-automation-tool-is-right-for-you

Julian Ereth

Julian Ereth is a researcher and practitioner in the field of business intelligence and data analytics.
