Hybrid Databases for Real Time Online Transaction Processing and Analytics
Conventional data management environments make use of two different classes of database management paradigms – one for transaction or operational processing and one for reporting and analytical processing. While both have relied on the same foundational technology (relational database management systems, or RDBMS), the processing models and requirements of the online transaction processing (OLTP) systems significantly differ from those of the reporting, business intelligence (BI), and analytical applications. OLTP systems are designed to quickly process individual sequences of transactions, while analytical/BI systems must have the data organized in a dimensional model to support querying, analysis, and “slicing and dicing” the data.
Traditionally, no single database instance could support both activities simultaneously, so the two types of systems have been kept separate. But splitting these functions creates overhead: it requires additional software systems, processes, and related activities to perform ETL (Extract, Transform, and Load), which exports, converts, and deposits the data into an isolated environment (such as a data warehouse or, increasingly, a data lake) to enable reporting and analysis. Data warehouses rely on traditional RDBMS platforms, with dimensional models designed to deliver the performance that frequent and complex business analysis queries demand.
A developing challenge to this conventional approach is the emergence of a plethora of real-time systems and platforms that produce continuous streams of structured and unstructured data. IoT (Internet of Things) devices, sensors, automated content generation systems, and social media applications have all whetted analysts’ appetites for faster ingestion of greater data volumes, feeding real-time analytics that describe operational activities and dynamically predict fluid and personalized interactions.
In the conventional data warehouse model, though, the latency between the time a transaction occurs and the time it is extracted, transformed, and loaded into the data warehouse prevents any immediate analysis that could influence actions in real time. In essence, this model rules out using the data for complex analysis that might prompt an immediate manual action or trigger an automated reaction to a customer interaction. And as organizations seek to differentiate their customer centricity initiatives by establishing strong customer relationships, supported by automated reactions to customer interactions through system touch points, there is a need to eliminate this latency and enable more immediate analytics so that opportunities for customer engagement and revenue generation are not missed.
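To make the latency concrete, here is a minimal sketch of a batch ETL job, using Python's built-in sqlite3 module as a stand-in for both the operational database and the warehouse. The table names, the `record_order` and `run_etl` helpers, and the amounts are all hypothetical, and a production pipeline would be far more involved. The point is only that a transaction committed after the last batch run stays invisible to the analytical side until the next run.

```python
import sqlite3

# Two separate stores, as in the conventional model (both hypothetical, in memory).
oltp = sqlite3.connect(":memory:")
warehouse = sqlite3.connect(":memory:")

oltp.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount_cents INTEGER)"
)
warehouse.execute(
    "CREATE TABLE sales_by_customer (customer TEXT PRIMARY KEY, total_cents INTEGER)"
)


def record_order(customer, amount_cents):
    """OLTP side: commit a single transaction."""
    with oltp:  # commits on success, rolls back on error
        oltp.execute(
            "INSERT INTO orders (customer, amount_cents) VALUES (?, ?)",
            (customer, amount_cents),
        )


def run_etl():
    """Batch job: extract orders, transform (aggregate), load into the warehouse."""
    rows = oltp.execute(
        "SELECT customer, SUM(amount_cents) FROM orders GROUP BY customer"
    ).fetchall()
    with warehouse:
        warehouse.execute("DELETE FROM sales_by_customer")
        warehouse.executemany("INSERT INTO sales_by_customer VALUES (?, ?)", rows)


def warehouse_total(customer):
    """Analytical side: read the aggregated total, or 0 if nothing is loaded yet."""
    row = warehouse.execute(
        "SELECT total_cents FROM sales_by_customer WHERE customer = ?", (customer,)
    ).fetchone()
    return row[0] if row else 0


record_order("alice", 5000)
run_etl()                          # the nightly batch, say
record_order("alice", 7000)        # happens after the batch has run
stale = warehouse_total("alice")   # the warehouse still reflects only the first order
run_etl()
fresh = warehouse_total("alice")   # the second order appears only after the next batch
```

Between the two `run_etl()` calls, any analysis run against the warehouse is working from stale data, which is exactly the window in which a real-time reaction to the customer would have been needed.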
That being said, there are developing options for working around this impediment. The dramatic reduction in the cost of fast storage technologies such as RAM and solid-state disk (SSD), coupled with recent innovations in database architectures, has enabled newer database technologies that can support both transaction processing and real-time analytics processing in a single database engine.
Several terms are being used to describe this general trend toward databases that support hybrid workload requirements within a single logical database. These databases can satisfy the transaction ordering and consistency requirements of an OLTP database with ACID (Atomicity, Consistency, Isolation, Durability) compliance, but also organize the data in ways that support rapid responses to queries and analyses for business intelligence, OLAP, and algorithmic analysis. Gartner uses the term “Hybrid Transaction/Analytical Processing” (HTAP) databases, while Forrester Research refers to these platforms as “Translytical” database products.
Both terms refer to database technologies that allow transaction processing and analytical processing to execute at the same time on the same data. Complex analysis can be performed in real time the moment a transaction occurs, and its findings can trigger actions or events. These database systems often combine innovative data storage, columnar data orientation, and data compression to keep the most frequently touched data in memory whenever possible. In-memory database management allows transactions to be applied quickly while letting analytical applications see those updates as soon as they occur.
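The pattern described above can be sketched in a few lines. This example again uses Python's built-in sqlite3 module with an in-memory database. SQLite is not an HTAP engine, so this only illustrates the shape of the idea: one store, a committed transaction, and an analytical query that sees it immediately and triggers an action. The `purchases` table, the `purchase_and_analyze` helper, and the loyalty threshold are all assumptions made for illustration.

```python
import sqlite3

# One in-memory store serves both the transactional and the analytical workload.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE purchases (id INTEGER PRIMARY KEY, customer TEXT, amount_cents INTEGER)"
)

LOYALTY_THRESHOLD_CENTS = 10_000  # hypothetical business rule


def purchase_and_analyze(customer, amount_cents):
    """Commit a purchase, then immediately run an aggregate over the same data."""
    with db:  # transactional write, committed when the block exits
        db.execute(
            "INSERT INTO purchases (customer, amount_cents) VALUES (?, ?)",
            (customer, amount_cents),
        )
    # Analytical read: no export or load step, so the new row is already visible.
    (total,) = db.execute(
        "SELECT SUM(amount_cents) FROM purchases WHERE customer = ?", (customer,)
    ).fetchone()
    if total >= LOYALTY_THRESHOLD_CENTS:
        return f"offer loyalty discount to {customer}"
    return None


actions = [
    purchase_and_analyze("alice", 4000),
    purchase_and_analyze("alice", 7000),
]
```

The second purchase pushes the customer's running total over the threshold, so the action fires as part of handling that same interaction rather than after a later batch load.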
A number of new database vendors offer these types of HTAP/Translytical products, and many traditional database vendors now complement their conventional products with offerings that adopt in-memory processing principles. While these systems may carry increased costs for specialized hardware configurations and operations skills, some business case studies suggest that HTAP reduces costs by eliminating the need to maintain and operate separate OLTP and data warehouse environments, and that it increases revenues as a result of the integrated real-time analytics.
Aside from customer engagement, use cases include incorporating machine learning, pattern recognition, and correlation for different applications such as:
- Fault detection, where specific assembly line issues are quickly recognized and staff members are alerted to take rapid corrective action, such as halting the assembly of faulty equipment or the production of unsafe food products, in order to save money and reduce liability exposure.
- Automated financial trading, where sequences of security purchase transactions can trigger hedging strategies.
- Public safety monitoring, in which applying the technology to public maintenance and management applications helps deter life-threatening events and potential disruptions to critical infrastructure.
- Data breach protection, in which sequences of network transactions are continuously analyzed to detect cyber attacks.
- Fraud analysis in health insurance, in which potentially suspicious claims are flagged in real time.
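Several of these use cases share the same shape: each incoming event is analyzed against recent history as it arrives. The following is a minimal sketch of that idea for the data breach case, using a sliding time window over failed logins. The `FailedLoginMonitor` class, the 60-second window, the failure limit, and the sample events are all hypothetical, and real detection systems rely on far richer signals.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 60  # hypothetical look-back window
MAX_FAILURES = 3     # hypothetical limit before a source is flagged


class FailedLoginMonitor:
    """Flags a source that exceeds MAX_FAILURES failed logins within the window."""

    def __init__(self, window_seconds=WINDOW_SECONDS, max_failures=MAX_FAILURES):
        self.window_seconds = window_seconds
        self.max_failures = max_failures
        self._failures = defaultdict(deque)  # source -> timestamps of recent failures

    def observe(self, timestamp, source, succeeded):
        """Record one event; return True if the source should be flagged now."""
        if succeeded:
            return False
        recent = self._failures[source]
        recent.append(timestamp)
        # Drop failures that have aged out of the window.
        while recent and timestamp - recent[0] > self.window_seconds:
            recent.popleft()
        return len(recent) > self.max_failures


monitor = FailedLoginMonitor()
events = [
    (0, "192.0.2.10", False),
    (10, "192.0.2.10", False),
    (20, "192.0.2.10", False),
    (30, "192.0.2.10", False),   # fourth failure within 60 seconds
    (200, "192.0.2.10", False),  # earlier failures have aged out by now
]
alerts = [monitor.observe(ts, src, ok) for ts, src, ok in events]
```

Only the fourth event raises an alert. In a hybrid database the equivalent check would run as a query against the live transaction data instead of in application memory, but the timing requirement is the same: the analysis has to complete while the sequence of events is still in progress.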
As this class of database tools evolves, we anticipate covering this space with increased scrutiny. In future posts, we will delve more deeply into the technology, approaches, and what differentiates each vendor’s solution.