CIO and CDO Strategies for Successful Data Initiatives
Mark Twain is said to have observed that “history doesn't repeat itself, but it often rhymes.” Technology also rhymes. As I write two Eckerson Group reports on streaming analytics this winter, I’m struck that many of its most compelling challenges and best practices apply equally to other technologies. Put differently, streaming analytics rhymes – with machine learning, AI, any data-intensive initiative.
Streaming analytics, as the name suggests, is the study of data “in flight,” meaning programmatic calculations that are performed on data as it is created (typically while it resides in memory) and before it is stored. Organizations across industries analyze data in flight to increase revenue, reduce cost and control risk. They apply streaming analytics to data from many types of events – credit card swipes, website clicks, the turn of machinery gears – anything that emits a data signal. Figure 1 illustrates the process.
Figure 1. Streaming Analytics
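To make "data in flight" concrete, here is a minimal sketch in Python of a calculation performed on each event as it arrives, before anything is stored. The function name `rolling_mean`, the window size and the sample card-swipe amounts are all my own illustrative assumptions, not part of any particular streaming product.

```python
from collections import deque
from typing import Deque, Iterable, Iterator


def rolling_mean(events: Iterable[float], window: int = 3) -> Iterator[float]:
    """Yield the mean of the most recent `window` values as each event arrives."""
    # A bounded deque keeps only the latest values in memory;
    # older values drop off automatically once the window is full.
    recent: Deque[float] = deque(maxlen=window)
    for value in events:
        recent.append(value)
        yield sum(recent) / len(recent)


# Hypothetical card-swipe amounts, standing in for a live event feed.
swipes = [20.0, 40.0, 60.0, 500.0]
means = list(rolling_mean(swipes, window=3))
```

Because the function is a generator, it never needs the full history. In a real deployment the list would be replaced by a feed from an event broker, and a sudden jump in the rolling mean (as the last swipe produces here) could be the signal that triggers an alert.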
Let’s walk through the perhaps-universal challenges facing organizations today as they embark on streaming analytics initiatives. I would welcome your thoughts on how broadly these concepts apply in the world of data and analytics.
Strategic confusion. Data teams can pursue skunkworks projects without clear business objectives or the support of business leaders. They also might assume “faster is always better,” and mistakenly replace historical use cases with real-time use cases, thereby losing valuable insights.
Dysfunctional teamwork. Business, IT and operational teams have different skills, objectives and vocabularies. They often try to keep things simple by going it alone, or they collaborate begrudgingly with insufficient joint planning and communication. This behavior limits the scope and success of streaming analytics initiatives.
Skills and time. Business leaders request streaming analytics capabilities that their teams might not be trained to manage. Data leaders often lack the development staff and skills to take on complex customization. Data teams often roll out complex, script-intensive streaming systems that confuse business teams. Both business and data leaders can underestimate the need for operational domain knowledge with IoT-related initiatives and fail to collaborate sufficiently with operations experts.
Governance. Business and IT teams are just getting their arms around the implications of the California Consumer Privacy Act (CCPA), effective Jan. 1, 2020, which forces companies to give customers new control over their personal data. Other state legislation in development, and the General Data Protection Regulation (GDPR) for European customers, force companies to reconcile different requirements in different geographies. They also must stay adaptable to future change.
Scalability and performance. Data teams struggle to manage the explosive growth of data from myriad sources. Data volumes can easily overload CPU resources or bandwidth, especially with on-premises infrastructure. Bottlenecks increase latency and reduce business value.
Data variety. Data teams struggle to integrate and analyze different data streams. New data sources, ranging from IoT sensors to social media feeds, require new APIs. New data types require new ways to process that data.
Inflexible architectures. Large enterprises in particular often struggle to copy data from legacy sources like mainframe systems or SAP applications. To do this, their data teams might need specialized programming skills or specialized software that restricts their ability to integrate with other platforms in the future. Data teams also face the risk of lock-in. If they customize their stream processor software on cloud Platform as a Service (PaaS) offerings, they make it harder to migrate elsewhere in the future.
So, how to address these data-related challenges? Organizations embarking on streaming analytics initiatives can reduce risk and cost, and increase returns, by applying the following best practices. I’d also welcome your thoughts on how broadly these concepts apply.
Treat streaming analytics as one arrow in your quiver. While you can transform aspects of your business with streaming analytics, it will not address all use cases. You still need historical analysis to understand deeper questions and the longer term. Event streaming for preventive maintenance, for example, does not replace root-cause analysis.
Evangelize streaming analytics as part of data modernization. Your CIO, CDO and/or CEO should already understand the business case for data modernization. Evangelize the data modernization attributes and benefits of your streaming initiative. For example, event streaming can replace inefficient legacy batch processes. By framing streaming analytics as data modernization, you can educate sponsors, secure budget and combine resources to improve the odds of success.
Build strong bridges between business and IT. Clearly communicate objectives, requirements, and inter-dependencies across teams. Identify and eliminate any grey areas. Establish a shared glossary and ontology to guide collaboration among data managers, business decision makers and (especially with IoT projects) operations experts.
Stay flexible with your architecture. Change is a constant. You might exceed objectives with one processor or visualization tool this year, but need to swap in new components next year as business objectives change. Prepare for future adjustments by investing in flexible change data capture (CDC) solutions or API libraries that integrate with many types of event sources, event brokers and data stores. Try cloud Platform as a Service (PaaS) processor tools, but carefully scope lock-in risk before customizing on them. You might want to migrate workloads to other platforms down the road.
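One way to picture this flexibility is to keep processing logic behind a small interface, so the event source can change without rewriting the analysis. The sketch below is illustrative only: `EventSource`, `InMemorySource` and `count_large_orders` are hypothetical names I made up for this example, and a real adapter would wrap whatever broker or CDC tool you actually use.

```python
from typing import Iterable, Iterator, Protocol


class EventSource(Protocol):
    """Hypothetical interface: anything that can hand back events."""

    def read(self) -> Iterator[dict]:
        ...


class InMemorySource:
    """Stand-in source for the demo; a broker adapter would expose the same method."""

    def __init__(self, events: Iterable[dict]) -> None:
        self._events = list(events)

    def read(self) -> Iterator[dict]:
        return iter(self._events)


def count_large_orders(source: EventSource, threshold: float) -> int:
    """Processing logic that depends only on the interface, not on a vendor."""
    return sum(1 for event in source.read() if event["amount"] > threshold)


demo = InMemorySource([{"amount": 5.0}, {"amount": 250.0}, {"amount": 900.0}])
large = count_large_orders(demo, threshold=100.0)
```

If you later switch brokers, only the class that implements `read` changes. The analysis function stays as it is, which is the kind of lock-in protection the paragraph above describes.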
Embrace Infrastructure as a Service (IaaS) to economize and simplify. Retire, or avoid building, in-house streaming infrastructure by adopting cloud-based event broker systems. Rather than making a capital expenditure, you pay operating expenses, scaling up or down without fixed cost commitments. You also reduce in-house effort because IaaS covers deployment configuration, administration and tuning of streaming analytics systems for optimal performance and resiliency.
Align to established governance practices. Engage your governance officer to scope and control the compliance risk that your streaming analytics initiative raises. While GDPR and CCPA are not technology specific, the real-time nature of streaming analytics raises the need to satisfy all compliance concerns before going into production. Carefully design and monitor processing software to avoid non-compliant actions on sensitive data such as Personally Identifiable Information (PII). Re-run this diligence process each time you add a new type of PII to your initiative.
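As a rough illustration of designing processing software with PII in mind, the sketch below masks sensitive fields before an event reaches downstream logic. The field names in `PII_FIELDS` and the choice of a SHA-256 digest are my assumptions for the example. Hashing is a form of pseudonymization, and whether it satisfies a given regulation is a question for your governance officer, not something this code settles.

```python
import hashlib
from typing import Dict, Iterable, Iterator

# Assumed field names for illustration; the real list should come
# from your governance review.
PII_FIELDS = {"email", "card_number"}


def mask_pii(event: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of the event with PII fields replaced by a SHA-256 digest."""
    masked = dict(event)  # copy, so the caller's event is not modified
    for field in PII_FIELDS.intersection(event):
        digest = hashlib.sha256(event[field].encode("utf-8")).hexdigest()
        masked[field] = digest
    return masked


def masked_stream(events: Iterable[Dict[str, str]]) -> Iterator[Dict[str, str]]:
    """Mask each event as it passes through, before any further processing."""
    for event in events:
        yield mask_pii(event)


sample = {"email": "pat@example.com", "amount": "42.50"}
safe = mask_pii(sample)
```

Adding a new type of PII then becomes an explicit change to `PII_FIELDS`, which gives you a natural point to re-run the diligence process.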
Plan for growth. Expect user demand to spike if you execute a successful initial project. If you reduce fraud risk for one line of business, others quickly take note. When they do, you need to be ready to accommodate growth. Build best-case and worst-case forecasts for architecture components and for hiring and training requirements to avoid budget surprises. Consider using IaaS to elastically scale resources and still meet SLAs.
So practitioners, take heart. In an industry full of shiny, confusing objects, we can count on common technology-agnostic principles to guide us along the way.