Predictive Analytics: What Do These Words Mean?

I was speaking with Wayne Eckerson the other day and he suggested that I write a blog post about ‘automated predictive analytics’. “Good idea,” I told him, but I was already writing about it, except that I was calling it ‘self-service predictive analytics’. To me they seemed like the same thing, but Wayne pointed out that there is a difference between the two. In fact, there are subtle differences between many of the terms that are used almost interchangeably these days in predictive analytics, which of course is very confusing.

So let’s do a tweet-length definition (I’ll try to keep it at 40 characters or less …) of some of these terms, with the aim of describing them in a way that your grandmother could understand (even if she is not a PhD statistician).

Predictive Techniques

  • Statistics – the grandfather of all techniques that seek to find patterns in data that can be used to predict and understand – sometimes unfairly perceived to apply only to small data, but in fact equally powerful on big data problems.
  • Data Mining – techniques that, when effective, are very similar to standard statistics but tend to be used on noisier data and include more of the latest techniques.
  • Machine Learning – people learn from past experience so these techniques try to have computers (aka machines) learn from the experience stored in their databases. Generally different from data mining in that it is applied to more real-world / unconstrained problems.
  • Deep Learning – a twist on neural networks that focuses on first finding patterns or features in the data and then learning by using those features as input (e.g. detecting lines within an image rather than just working with a bitmap).
  • Predictive Analytics – techniques and tools for taking historical data and making predictions – generically could be referring to statistics, data mining or machine learning.
  • Artificial Intelligence – researchers in this area have been working in earnest since the middle of the last century to make a computer or robot generally as smart as a person in real-world environments (e.g. autonomous cars, beating grandmasters at chess or Go).
  • Cognitive Computing – similar to the field of Artificial Intelligence but more inclusive of other technologies and techniques with some stronger emphasis on human / machine interaction (e.g. speech recognition, face detection and sentiment analysis).

Predictive Approaches and Tools

  • Self-Service Predictive Analytics – tools that do not require coding and can be operated by a business person via an intuitive user interface.
  • Cloud Based Analytics – analytics solutions that are hosted somewhere else so that you don’t have to build your own infrastructure – usually accessed through a web interface client.
  • Automated Predictive Analytics – a tool that simply requires an input of data with a request for something to predict and it automatically makes predictions for the user. Aka “human-less analytics”.
  • Outsourced Analytics – utilizing data analysts as consultants usually through the cloud. This has more of a dedicated consultant feel to it than just using a community solution.
  • Outsourced Analytics Community – performing analytics of all kinds via a third-party site that matches problems and data to appropriate analysts (e.g. Experfy or Kaggle).

Analysis Techniques

  • Data Analysis – a very general term that refers to any use of data to make decisions. This could include predictive analytics, reporting, or the creation and consumption of Key Performance Indicators (KPIs).
  • Business Intelligence (BI) – similar to data analysis, but often used to refer to IT-managed reporting, analysis, and dashboard tools and more often focused on a business user doing simple analysis.
  • Visual Discovery – self-service BI tools that enable data analysts to query, combine, visualize, and analyze data without assistance from their IT departments.

How Do All These Terms Relate to One Another?

When you look at these terms, you can organize them in a number of ways to make sense of them. One of the best is by how much knowledge of statistics, data mining and related fields each one requires of the person using it.

One Dimension: Who Should Use Them?

If we ranked these terms from most likely to require a PhD in statistics or data science to “easily used by a business user” this would be a good first pass:

  1. Statistics – expert
  2. Data Mining – expert
  3. Data Analysis – expert
  4. Machine Learning – expert
  5. Artificial Intelligence – expert
  6. Deep Learning – expert
  7. Predictive Analytics – expert
  8. Cloud Based Analytics – need to use simple tools
  9. Self-Service Predictive Analytics – need to use simple tools
  10. Business Intelligence (BI) – business user – self-directed
  11. Outsourced Analytics Community – business user – need to talk to data analysts
  12. Outsourced Analytics – business user – need to talk to data analysts
  13. Automated Predictive Analytics – business user – just needs to understand their data and business problem

Another Dimension: Are They Narrow or Broad in Application?

What is interesting is that I’m not introducing a dimension like ‘better’ or ‘more powerful’ as a way to describe these Predictive Analytics options. They are all quite powerful or at least powerful enough for the problems that they are being applied to. But there is another way that they can be looked at: the flexibility that the end user has in using them.  I won’t relist them here as they are basically in the same order as above when they are listed from the most general and flexible to the most focused.

The bottom line is that if you have the ability and focus to bring these tools in-house, you will have the most flexibility in applying them and in coming up with novel problems that can be solved. For instance, if you outsource your predictive analytics (PA) team, you will need to give them very specific instructions about what you want. But your PA professionals may not know your business very well (which is one reason it is beneficial to have a dedicated team if you do outsource). They may also not know your data very well. Small details such as “Is this piece of data collected every 4 weeks or once a month?” can have a surprisingly large impact once they are built into a predictive model. There may also be opportunities to request new data sources that only someone who is intimately familiar with your data and business would spot.
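To see why the “every 4 weeks or once a month” question matters, here is a minimal sketch of the arithmetic. The sales figure and the function names are made up for illustration, and it assumes a 365-day year; it is not taken from any particular tool.

```python
DAYS_PER_YEAR = 365  # assumption: a non-leap year


def periods_per_year(days_between_collections: float) -> float:
    """How many collection periods fit into one year."""
    return DAYS_PER_YEAR / days_between_collections


def annualize(per_period_value: float, periods: float) -> float:
    """Scale a per-period figure up to a yearly figure."""
    return per_period_value * periods


# Hypothetical figure: 100 units recorded per collection period.
per_period_sales = 100.0

monthly_periods = 12                      # what an outsider might assume
four_week_periods = periods_per_year(28)  # what the data actually is (about 13)

annual_assuming_monthly = annualize(per_period_sales, monthly_periods)
annual_if_four_weekly = annualize(per_period_sales, four_week_periods)

# Fraction of the true yearly total lost by the wrong assumption.
shortfall = 1 - annual_assuming_monthly / annual_if_four_weekly

print(f"Assumed monthly:  {annual_assuming_monthly:.0f} units per year")
print(f"Actually 4-weekly: {annual_if_four_weekly:.0f} units per year")
print(f"Shortfall: {shortfall:.1%}")
```

Under these assumptions, treating 4-week data as monthly understates the yearly total by roughly 8%, which is the kind of error someone close to the data would catch right away.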

Final Advice: Start Slow – Build Up

The good news is that most of the buzzwords you hear about Predictive Analytics are not that different from each other as far as your business needs are concerned. What is important is that you get started now with something small, using an easy-to-use or automated tool or a fully outsourced offering. Measuring the ROI of these programs will then let you decide whether to start building a more sophisticated core competency within your company.

Thanks to Wayne Eckerson, who suggested this topic and provided expert feedback on this article. Mr. Eckerson is an internationally recognized thought leader in the fields of business intelligence and analytics. www.eckerson.com

 



Stephen J. Smith

Stephen Smith is a well-respected expert in the fields of data science, predictive analytics and their application in the education, pharmaceutical, healthcare, telecom and finance...
