Part 1: The Data Warehouse Has Been Resurrected
Last month my friend and colleague Stephen Smith published an article on the demise of the data warehouse. As we crisscrossed the country together, I joked about posting a rebuttal in defense of our sacred cow. Well, all joking aside, here it is.
The data warehouse is not dead. There, I said it! But yours may be on life support. In this article, I discuss the pros (and more pros) of the enterprise data warehouse (EDW). I also discuss why your EDW may be on life support, and why you may be thinking of pulling the plug.
A word about the Data Lake
This term has been thrown around extensively in the last few years, but the concept is far from new. Some call this a staging area, or the space between the E and the T of ETL. This is the first stop your data takes on the train to the enterprise data warehouse. It is dumped in an area that could be a separate database, a separate schema or perhaps neither (although for most implementations I don't recommend the latter). For the most part, data is “dumped” here first as a direct replica of the source system, which is preferred for reasons I don’t have room to explain. However, because of the “rawness” of this data, it is not advantageous to promote this as “the place” for your users to harness data. Unless you want to play 20 questions every day with your user community about why certain data shows up certain ways (because it hasn't been denormalized, cleansed or formatted in a way that takes advantage of conformed dimensions), I suggest you erase this option from your memory.
First, allow me to defend my carefully crafted conformed dimensions and magnificently molded metrics. The data warehouse isn't dead for many reasons; I've chosen to share three of them.
1. Enterprise Questions Call for Enterprise Solutions
The questions answered by a good business intelligence (BI) system go far beyond a single domain. For example, what is the correlation between a high performer, the training they’ve received, and the defect rate in their code? Or, what if an airport would like to see how the number of flights and the weather affect sales of food, retail and parking? Or, what if a food chain wishes to see the relationship between customer service scores, sales and employee engagement? These questions cross domains and require an enterprise solution to answer them.
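To make the airport example concrete, here is a minimal sketch in Python using the standard-library sqlite3 module. Everything in it is an assumption for illustration: the table names (dim_date, fact_flights, fact_sales), the columns and the sample numbers are all made up, and a real EDW would carry far more dimensions than a single date. The point is only that two domains (flight operations and concession sales) can be answered in one query because they share a conformed date dimension.

```python
import sqlite3


def build_demo_warehouse():
    """Create a tiny in-memory warehouse. All names and values are hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Conformed dimension shared by both domains
        CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, calendar_date TEXT);
        -- Flight operations domain: one row per day
        CREATE TABLE fact_flights (date_key INTEGER, flight_count INTEGER);
        -- Concessions domain: one row per day and sales category
        CREATE TABLE fact_sales (date_key INTEGER, category TEXT, amount REAL);

        INSERT INTO dim_date VALUES (1, '2024-01-01'), (2, '2024-01-02');
        INSERT INTO fact_flights VALUES (1, 120), (2, 80);
        INSERT INTO fact_sales VALUES
            (1, 'food', 5000.0), (1, 'parking', 2000.0),
            (2, 'food', 3000.0), (2, 'parking', 1500.0);
        """
    )
    return conn


def sales_vs_flights(conn):
    """Return (date, flight count, total sales) per day across both domains."""
    query = """
        SELECT d.calendar_date, f.flight_count, SUM(s.amount) AS total_sales
        FROM dim_date AS d
        JOIN fact_flights AS f ON f.date_key = d.date_key
        JOIN fact_sales AS s ON s.date_key = d.date_key
        GROUP BY d.calendar_date, f.flight_count
        ORDER BY d.calendar_date
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in sales_vs_flights(build_demo_warehouse()):
        print(row)
```

Without the shared date_key, each domain would bring its own notion of a "day", and somebody would have to reconcile them by hand before the question could even be asked.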
2. Centralized Place for Tools
One of the goals of a BI system should be minimizing the number of places users have to look for data. Having 45 places to look for data wreaks havoc on businesses, not to mention the drain on IT. Users need a reliable place to get information. When it's time to answer a question, a data safari is not their idea of an adventurous afternoon. The average user is not going to be able to write SQL statements across multiple domains against nonconformed, untransformed data, nor do they want to.
One tool - if the data warehouse were to meet its demise, it would give way to a slew of tools, all having to be supported by IT. Having one tool for all users allows for extensive training, robust support, simpler licensing models, easier rollout, controlled upgrades and so on. Some may say that having a data warehouse doesn't dictate one tool, and neither does the absence of an EDW. Both statements are definitely true. However, the absence of an EDW and a centralized data repository sends a message to the business that IT may not have a standard for data access. Without a single tool or a standard, customers may start diversifying tools, since they may not know what other parts of the business are doing. In either case, supporting a veritable plethora of tools creates a whole heap of headaches for IT. And yes, I'm from Texas; we say whole heap, sometimes.
3. Curb Rogue and Shadow IT
Two bad words in the technology arena are “rogue” and “shadow” when it comes to IT. The elimination of the data warehouse could create entire shadow IT regimes.
Giving all parties access to the data lake or staging area as a standard practice could encourage those rogue tendencies to show their ugly paws. Why should you care? Remember why the data warehouse came to be in the first place: those two executives showing up to the board meeting with conflicting numbers. Let's not go back there. No more silos.
Give the BI team something to do. Sounds funny, I know, but really, your BI and analytics teams should be building nice dashboards, connecting data points, training the business on the data that’s available and concentrating on building a successful analytics program.
These are just a few of the reasons we shouldn’t pull the plug on that data warehouse just yet. You may have experienced some frustrations with your data warehouse, and I’m not here to minimize those headaches; I will address some of them in a later blog post. In the meantime, trust that the data warehouse has plenty of life left in those old bits and bytes. Keep the points above in mind as you grow your data warehouse to new heights.