
GPU Databases: Getting more Value from your Machine Learning Infrastructure

The last several months have seen a lot of announcements related to Graphics Processing Units (GPUs). A few of my favorites are NVIDIA’s launch of the Ampere architecture, Apache Spark support for GPUs, and NVIDIA’s partnership with my alma mater, the University of Florida, to build the largest supercomputer in academia. Much of the focus and industry growth in the use of GPUs has come from their suitability for machine learning, especially the neural network techniques more commonly known as deep learning.


Companies have invested significantly in procuring GPUs, whether in the cloud, the data center, or on the desktop. The trajectory of NVIDIA’s stock price is certainly evidence of that. While the vast majority of this infrastructure investment is supporting machine learning and high-performance computing, the remainder is in visualization, simulation, computer-aided design (CAD), and of course, gaming.

A new application for GPUs emerged over the last few years and is reaching critical mass within the enterprise: GPU databases. Given the availability of GPU infrastructure in the enterprise, GPU databases have a strong value proposition for companies looking to extend their hardware or cloud investment as well as increase the productivity of their data scientists and engineers. 

A Little History

GPU-enabled databases trace their origins to NVIDIA’s entrance into the graphics card market in the 1990s. In 1999, NVIDIA introduced the concept of the Graphics Processing Unit, which NVIDIA defined as “a single-chip processor with integrated transform, lighting, triangle setup/clipping, and rendering engines capable of processing a minimum of 10 million polygons per second.” In the same year, Epic Games and Activision released Unreal Tournament and Quake III Arena, respectively. These were two of many titles to come that could easily crush the hardware capabilities of any commercial or home-built gaming rig. As a result, the hardware industry was driven to develop faster solutions to keep up with the user experience envisioned by the game publishers.

The next major milestone was NVIDIA’s release of the Compute Unified Device Architecture (CUDA) in 2007. CUDA aimed to make it easier for developers to use the GPU for more general computing tasks by providing a layer of abstraction over the hardware while still harnessing the speed and parallelism of the NVIDIA architecture. Slowly but surely, CUDA enabled GPU acceleration across a host of applications, mainly those that rely on vector math.




MPP Databases. In the following year, during my time at Sun Microsystems, I was introduced to a new class of database platforms: massively parallel processing (MPP). The term “big data” was going mainstream. We were beginning to run into customer problems in which legacy monolithic RDBMS solutions could not support the volume, velocity, or variety of data generated by internet-scale business processes. Because I focused on analytics and reporting solutions, I became familiar with MPP databases from Greenplum (now Pivotal), ParAccel (Actian, as well as the initial codebase for AWS Redshift), Kognitio, and Vertica. The columnar architecture implemented on commodity hardware provided stiff competition for proprietary solutions from Netezza and Teradata.

Given what I understood about these MPP architectures, I explored whether anyone had considered using the performance and parallelization of GPUs to handle the same types of database loads. Through my research, I discovered a paper titled “Relational Joins on Graphics Processors,” which discussed how to use the NVIDIA CUDA framework to parallelize the algorithms used to support SQL JOIN operations. Executing these algorithms in-memory on a GPU resulted in an increase in query speed of 2 to 7 times compared to optimized CPU-based approaches. It would not be until 2015 that the first native GPU databases would come to market. I suspect that the CUDA framework and MPP technology needed to mature before anyone could field a stable and performant native GPU database solution.
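To make the idea of a parallel join more concrete, here is a minimal sketch in plain Python of a hash join split into a build phase and a probe phase. It is not the algorithm from the paper and it does not use CUDA; the function names, the sample tables, and the chunk size are all assumptions made up for illustration. The point is only that each probe chunk reads a shared table and writes nothing shared, which is the property that lets a GPU hand chunks to many threads at once.

```python
from collections import defaultdict


def build_hash_table(rows, key_index):
    """Build phase: map each join key to the rows that carry it."""
    table = defaultdict(list)
    for row in rows:
        table[row[key_index]].append(row)
    return table


def probe_chunk(chunk, key_index, table):
    """Probe phase for one chunk of rows.

    The chunk only reads from the hash table, so chunks are independent
    of one another and could, in principle, be processed in parallel.
    """
    return [
        match + row
        for row in chunk
        for match in table.get(row[key_index], [])
    ]


def partitioned_hash_join(left, right, left_key, right_key, chunk_size=2):
    """Inner join of two lists of tuples on the given key positions."""
    table = build_hash_table(left, left_key)
    chunks = [right[i:i + chunk_size] for i in range(0, len(right), chunk_size)]
    joined = []
    for chunk in chunks:  # run sequentially here for simplicity
        joined.extend(probe_chunk(chunk, right_key, table))
    return joined


# Hypothetical sample data: (customer_id, name) and (order_id, customer_id, amount)
customers = [(1, "Ada"), (2, "Grace"), (3, "Edsger")]
orders = [(101, 1, 25.0), (102, 2, 40.0), (103, 1, 15.5), (104, 4, 9.0)]

result = partitioned_hash_join(customers, orders, left_key=0, right_key=1)
for row in result:
    print(row)
```

Order 104 has no matching customer, so it drops out of the inner join. On real hardware the loop over chunks is where the parallelism would go; this sketch keeps it sequential so that it runs anywhere.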

The What and Why of GPU Databases

Today, I recognize GPU databases as a subset of MPP databases. Like MPP databases, GPU databases run on commodity hardware, implement a parallel architecture, and use columnar tables, summary metadata, and data compression. At the time of this writing, there are four classes of GPU databases available on the market:

  1. Relational, Accelerated. These solutions take a conventional database architecture and accelerate certain functions, usually aggregations and other vectorizations, and allocate those workloads to GPUs. Although there is some performance benefit, this is the least efficient implementation.
  2. Relational, In-memory. These database solutions are written from the ground up to run natively on GPUs. They use GPU and system memory to deliver extremely fast query operations. Because they operate only in memory, the data sets available for analysis are limited to roughly 1 to 10 TB, which makes them well suited for purpose-built data science workstations.
  3. Relational, Out-of-memory. These GPU solutions do not have an in-memory limitation since they process data from attached storage as well as process in-memory. The architecture may allow for clusters of GPU servers, more closely resembling legacy MPP databases, enabling them to analyze petabytes of data.
  4. Non-relational. These are NoSQL or other types of non-relational databases. As mentioned in the introduction, Spark 3.x now supports GPU acceleration. Graph databases lend themselves well to a native or accelerated GPU implementation because of the high degree of vectorization in the data model. It is important to note that GPUs perform best on structured data.
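The columnar tables and summary metadata that GPU databases share with MPP databases can be illustrated with a small sketch. The code below is an assumption-laden toy, not the storage format of any particular product: it stores one column in fixed-size blocks, records a minimum and maximum for each block, and uses those values to skip blocks that cannot satisfy a filter. The names `build_blocks` and `sum_greater_than` and the sample values are hypothetical.

```python
def build_blocks(values, block_size):
    """Split one column into blocks and record min/max metadata per block."""
    blocks = []
    for start in range(0, len(values), block_size):
        chunk = values[start:start + block_size]
        blocks.append({"min": min(chunk), "max": max(chunk), "values": chunk})
    return blocks


def sum_greater_than(blocks, threshold):
    """Sum the values above a threshold, scanning only blocks that can match.

    Returns the total and the number of blocks that were actually scanned.
    """
    total = 0
    scanned = 0
    for block in blocks:
        if block["max"] <= threshold:
            continue  # metadata proves no value in this block qualifies
        scanned += 1
        total += sum(v for v in block["values"] if v > threshold)
    return total, scanned


# Hypothetical "amount" column of a sales table
amounts = [5, 7, 6, 40, 55, 42, 8, 9, 3, 60, 70, 65]
blocks = build_blocks(amounts, block_size=3)

total, scanned = sum_greater_than(blocks, threshold=30)
print(f"total={total}, blocks scanned={scanned} of {len(blocks)}")
```

Because an aggregation touches only the one column it needs, and the metadata rules out half of the blocks here, far less data has to move to the processor. The surviving blocks are independent of one another, which is what makes this layout a natural fit for parallel hardware.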

Regardless of the architectural approach, all GPU databases are optimized for analytics workloads. They are not designed as replacements for transactional systems, though some are ACID compliant. Common use cases for GPU databases include:

  • Real-time visualization or ad-hoc querying.
  • Real-time stream processing of many small streams of data.
  • Almost any graph analysis.

Value Proposition

There are three primary value propositions for considering a GPU database.

Price-Performance. For the vast majority of analytics workloads, one will be hard-pressed to implement a solution with a higher compute density. Early comparisons show cost savings of two to three orders of magnitude over CPU-based MPP solutions. Those analyses are data center centric, and mileage may vary when deploying in the cloud.

Familiarity. While many NoSQL approaches provide alternative high-performance analytics solutions, they require users to develop in new and unfamiliar paradigms. GPU databases leverage ANSI standard SQL with minimal extensions. From the first day, users skilled in SQL can be productive.
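As a rough illustration of what that familiarity means in practice, the sketch below runs an ordinary SQL aggregation. It uses Python's built-in `sqlite3` module purely as a stand-in so the example runs anywhere; SQLite is a CPU-based engine, not a GPU database, and the `sales` table and its columns are hypothetical. The part that matters is the query text, which contains nothing beyond standard SELECT, GROUP BY, and ORDER BY.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 120.0), ("west", 80.0), ("east", 30.0), ("west", 45.5)],
)

# A plain analytics query: count and total per region.
query = """
SELECT region, COUNT(*) AS orders, SUM(amount) AS revenue
FROM sales
GROUP BY region
ORDER BY region
"""

rows = conn.execute(query).fetchall()
conn.close()

for region, orders, revenue in rows:
    print(region, orders, revenue)
```

An analyst who can write this query already has the core skill a SQL-based analytics engine asks for. Any vendor-specific extensions would need to be checked against that product's documentation.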

Increased Hardware Utilization. Many enterprises are investing in infrastructure, whether in the data center or the cloud, to support machine learning projects. Running a GPU database on those same resources could extend business analytics to other parts of the organization. Further, data scientists and engineers often have workstations equipped with GPU cards to support their activities, similar to my own system pictured in Figure 1. Installing a GPU database on those workstations could improve overall productivity and reduce the need for additional cloud-based databases.

Figure 1. Internals of my custom data science workstation with an NVIDIA Quadro RTX 6000.

Getting Started

There are compelling reasons to evaluate including GPU databases in the overall enterprise data management ecosystem. The value proposition is strong, but it is dependent on having the right business drivers. Talk to business users to get an understanding of performance pain points and identify current or future real-time or near real-time insight requirements. An excellent place to test the feasibility of the technology and fit with the enterprise may be with the data science and engineering organization.