Rich Galan: Real-Time Analytics is Necessary and Anomaly Detection is Rad
In this episode, Wayne Eckerson and Rich Galan discuss the obstacles to delivering timely analysis, the problems that large volumes of data create, solutions to those issues, and where BI is headed in the near future.
Rich Galan is a veteran data analytics leader with 20 years of experience in a variety of data-driven organizations. He is passionate about driving data analytics innovation through the use of real-time data, anomaly detection, and artificial intelligence.
He has grown several data analytics teams from infancy to full maturity, delivering insights via web, mobile, video, and real-time platforms. He is currently an independent BI, data, and analytics consultant engaged at the Rubicon Project, an ad tech company.
Key takeaways:
- Agreeing on correct definitions for metrics is a prerequisite for real-time analytics
- Every business can be divided into three metrics: volume, price, and conversion
- Machine learning allows for model adjustments on the fly
- Static thresholds do not work for long, unless they’re binary (see the sketch after this list)
- It’s impossible for analysts alone to sort through all of a business’s data; AI makes it feasible
- AI can provide a starting point for analysis, rather than starting with a hypothesis first
- The 80-20 rule can indicate the health of your business
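The point about static thresholds is worth making concrete. Below is a minimal Python sketch (hypothetical data and parameters, not from the episode) contrasting a fixed alert limit with a rolling-baseline rule of the kind anomaly detection tools automate: on a metric that grows over time, the fixed limit starts firing on every normal point, while the adaptive rule flags only the genuine spike.

```python
# Minimal sketch (hypothetical data): a static threshold goes stale on a
# growing metric, while a rolling mean +/- K sigma baseline adapts.
from statistics import mean, stdev

# Simulated per-minute request counts: steady growth plus one genuine spike.
series = [100 + 2 * t for t in range(60)]
series[45] += 80  # the real anomaly we want to catch

STATIC_LIMIT = 150   # tuned once, on day-one traffic
WINDOW, K = 15, 3.0  # rolling window size and sigma multiplier

for t, value in enumerate(series):
    # Static rule: fires on every point once normal growth passes the limit.
    static_alert = value > STATIC_LIMIT

    # Adaptive rule: compare against a rolling baseline of recent points.
    window = series[max(0, t - WINDOW):t]
    if len(window) >= 2:
        baseline, spread = mean(window), stdev(window)
        adaptive_alert = abs(value - baseline) > K * max(spread, 1.0)
    else:
        adaptive_alert = False

    if static_alert or adaptive_alert:
        print(f"t={t:2d} value={value:3d} "
              f"static={static_alert} adaptive={adaptive_alert}")
```

Running this, the static rule alerts on every minute from t=26 onward, while the adaptive rule fires only at t=45, the actual spike.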
Below is one question and answer from the podcast:
Wayne Eckerson: Now you use this tool (Anodot), which uses machine learning and real-time processing to tackle the intersection of all the metrics and dimensions. How many metrics are we talking about in total? How many can it track on a day-to-day basis?
Rich Galan: They can track millions.
Wayne Eckerson: Millions! Wow.
Rich Galan: Yeah, literally. So we’re sending a little over 100,000 every minute, and that’s going to scale out. That’s only on the business side; we’re going to scale this out to the operational metrics. Business-level metrics are more “How is this customer doing, and here’s the revenue based off of that.” Operational metrics say, “How is this data center running?”
Or, if we have code pushes, we do burn-ins. You have machines that live inside of clusters or racks, which live inside of a data center. When we do code pushes, we push out to one machine and say, “Ok, this code looks fine on this machine, now scale it out to the next ten machines. Ok, it looks fine, scale it out to the cluster. Ok, it looks fine, scale it out to the data center.” Then, if for whatever reason the code gets pushed in the middle of the night and it crashes, you get alerted that something happened at the node or machine level. Rather than waiting until the next day for engineers to pick back up, an alert can be sent to Slack or to someone on PagerDuty. So it’s kind of throwing machines at these potential problems, rather than having humans stay up all night.
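The burn-in process Rich describes is a staged rollout: widen the blast radius only while health checks stay green, and page a human the moment a stage fails. Here is a minimal Python sketch of that pattern; the stage names, sizes, failure rate, and helper functions are hypothetical stand-ins, not Rubicon Project’s actual tooling.

```python
# Minimal sketch of a staged rollout (burn-in): push to one machine, then
# widen the blast radius stage by stage, halting and alerting on failure.
import random

# Each stage widens the blast radius: 1 machine -> 10 -> cluster -> data center.
STAGES = [("one machine", 1), ("ten machines", 10),
          ("cluster", 100), ("data center", 1000)]

def healthy(stage_name: str) -> bool:
    """Stand-in health check; in practice this would query the anomaly
    detection system's alerts for the machines in this stage."""
    return random.random() > 0.05  # assume a ~5% chance a stage surfaces a problem

def alert(message: str) -> None:
    """Stand-in for a Slack webhook or a PagerDuty incident."""
    print(f"ALERT -> Slack/PagerDuty: {message}")

def staged_rollout(build_id: str) -> bool:
    for stage_name, machines in STAGES:
        print(f"Pushing {build_id} to {stage_name} ({machines} machines)...")
        if not healthy(stage_name):
            # Stop widening the rollout and wake a human, rather than
            # waiting for engineers to notice the next morning.
            alert(f"{build_id} failed burn-in at {stage_name}; rollout halted")
            return False
        print(f"{stage_name} looks fine, scaling out...")
    print(f"{build_id} fully rolled out")
    return True

staged_rollout("build-1234")
```

The key design point is that the machines do the overnight watching: the rollout only advances on green health checks, and a failed stage produces a page instead of a silent crash.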