
Little Data Needs Love Too!

It’s not the size of your data, it’s what you do with it. (Get your minds out of the gutter; I’m serious.) These days, with all the hype around big data, it’s easy to get caught up in thinking your data isn’t important if it’s not huge, or that it’s somehow useless if it’s less than a petabyte. When it comes to the reason BI and analytics exist in the first place, the size of the data plays no part at all. To prove this point, I’m going to give you five factors to think about regarding your little data.

1.) Big Data is trying to get small

I heard a guy say this one time and it resonated with me: "No matter how big the data is, it’s trying to get small." Think about it: you can’t consume seven petabytes of data. You would have a problem trying to consume seven PAGES of data. So we are constantly whittling away at data to get it down to a report, a dashboard, or an insight. If your data is small already, just remember big data is trying to be like you. Keep setting that trend.

2.) It’s all about small questions

It’s not about big data, it’s about small questions. Netflix uses a ton of data around user choices, including watch and pause patterns, search histories, watch histories, and user choice in films over time, both individually and collectively. They actually do have over seven petabytes of data that they use to effectively answer one question: What are you likely to want to watch?

The answer gets delivered in two ways.

a.) Suggestions on the app

b.) New shows produced by Netflix based on user preferences 

Keep in mind, it’s the small questions that need to be answered, no matter how big or small your data is. You may spend a ton of time using R and coming up with a predictive analytics algorithm, but the client just wants to know how much we should order Friday. Don’t lose sight of that.
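To make the "how much should we order Friday" question concrete, here is a minimal sketch of how small the answer can be. The sales figures, the dates, and the `friday_order_estimate` helper are all made up for illustration, and averaging past Fridays is just one simple assumption about how you might estimate it, not a recommendation over a proper forecast.

```python
from datetime import date
from statistics import mean

# Hypothetical daily unit sales; dates and numbers are invented for illustration.
daily_sales = {
    date(2024, 5, 3): 120,   # Friday
    date(2024, 5, 6): 80,    # Monday
    date(2024, 5, 10): 130,  # Friday
    date(2024, 5, 15): 75,   # Wednesday
    date(2024, 5, 17): 140,  # Friday
}

def friday_order_estimate(sales):
    """Average the units sold on past Fridays (date.weekday() == 4)."""
    fridays = [units for day, units in sales.items() if day.weekday() == 4]
    return round(mean(fridays))

print(friday_order_estimate(daily_sales))  # prints 130
```

The point is not the arithmetic. It is that the client gets one number that answers the question they actually asked.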

3.) Easier to work with

Small data is better in some regards. There are no long ETL routines that run for hours and hours, and no space issues associated with storing, backing up, or caching data. You can also keep more history without issue. I once oversaw a data warehouse that, in total, was about half a terabyte, but it had over three dozen sources and supported a ton of users in various departments with about 20 years of history. Compare that to another team I once worked on, which had a huge database that saw fewer than 50 queries a month. Even though both the company and the team were three times the size and the data warehouse was huge, almost no one was using it.

4.) Still start with the business

There is no substitute for business value. Data, big or small, starts and ends with the business. It’s created by the business, it’s consumed by the business, and it benefits the business. If the data is small enough you should be able to manipulate it quicker and deliver value faster. Think BV - business value. The business doesn’t care one way or another about the size of the data. Have conversations to find out their needs and deliver. If you run into situations where you don’t have the appropriate types of data to answer the question, work with the business to capture it. But focus on the right type of data and not just an abundance of data.

5.) Don’t buy more than you need

There are all types of tools available in the marketplace, and I’ve seen some pretty slick ones. Keep your end goal in mind, though. If you’re moving 59 rows of data a day, you probably don’t need a 400k ETL tool. Vendors are great at making you believe you need a certain tool or solution. To a hammer everything is a nail, so be careful. While there are some great vendors and tools available, if your users are currently running the entire business from a five-column spreadsheet with 76 rows, I wouldn’t be looking to implement a blockchain solution any time soon.
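For a sense of scale, moving a few dozen rows a day can be a short script using nothing but Python’s standard library. The sketch below is an illustration under stated assumptions, not a production pipeline: the `orders` table, its columns, the sample rows, and the `load_rows` helper are all hypothetical, and an in-memory SQLite database stands in for whatever your real target is.

```python
import csv
import io
import sqlite3

# Hypothetical daily extract; in practice this would be a small file on disk.
daily_extract = io.StringIO(
    "order_id,customer,amount\n"
    "1001,Acme,250.00\n"
    "1002,Globex,75.50\n"
)

def load_rows(source, conn):
    """Read CSV rows and upsert them into a hypothetical orders table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    rows = [
        (int(r["order_id"]), r["customer"], float(r["amount"]))
        for r in csv.DictReader(source)
    ]
    with conn:  # commits on success, rolls back if an insert fails
        conn.executemany("INSERT OR REPLACE INTO orders VALUES (?, ?, ?)", rows)
    return len(rows)

conn = sqlite3.connect(":memory:")  # stand-in for the real target database
loaded = load_rows(daily_extract, conn)
print(loaded)  # prints 2
```

If a script this size covers the job, that is a useful data point to bring into any tool-buying conversation.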


In the end, it doesn’t matter to the business how big, wide, and unwieldy your data is. They don’t care, and they shouldn’t care. They just need a problem solved. I’m not saying stop going to conferences and reading vendor emails, but I am saying know your space, know your data intimately, and be familiar with the types of business questions and organizational problems that can be solved using your data, big or small.

DeWayne Washington

DeWayne Washington is a senior consultant with 20+ years of experience in BI and analytics in over two dozen verticals. He is also the author of a book.