Data Architecture as a Service: How to Evaluate Products
ABSTRACT: Products that support data architecture-as-a-service are emerging. This article describes what to look for in a commercial DAaaS-based product.
Data architecture as a service, or DAaaS, is a timeless concept wrapped within some new commercial data pipeline products. The goal of DAaaS is to inject architectural standards and SQL guardrails into no-code data development tools so business users can develop their own data pipelines without IT assistance and without creating data silos. By abstracting architecture within a no-code environment, DAaaS-based tools align data pipelines and repositories with enterprise data standards. (See “Data Architecture-as-a-Service: Liberation for Data Users.”)
From a developer perspective, DAaaS fosters reuse, accelerates development, and eliminates costly errors that arise when inexperienced developers try to create complex SQL. It’s a metadata-driven approach that uses configurable templates within commercial data pipeline products that automatically generate SQL code. DAaaS enables data analysts, data scientists and new data developers to leverage the work of experienced architects and engineers without having to learn the intricacies of SQL coding and data pipeline design and development.
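To make the metadata-driven idea concrete, here is a minimal Python sketch of a configurable template that generates SQL from metadata. It is an illustration under stated assumptions, not any vendor's implementation: the template text, the `render_stage_sql` function, the metadata keys, and the table and column names are all hypothetical.

```python
from string import Template

# Hypothetical template that an architect might maintain once for all
# staging steps. The SQL shape and placeholder names are illustrative only.
STAGE_TEMPLATE = Template(
    "CREATE OR REPLACE VIEW $target AS\n"
    "SELECT $columns\n"
    "FROM $source\n"
    "WHERE $filter"
)


def render_stage_sql(meta: dict) -> str:
    """Generate the SQL for one pipeline step from its metadata."""
    # Each column entry pairs a source expression with an output name.
    columns = ", ".join(
        f"{col['expr']} AS {col['name']}" for col in meta["columns"]
    )
    return STAGE_TEMPLATE.substitute(
        target=meta["target"],
        source=meta["source"],
        columns=columns,
        # Fall back to a filter that keeps every row when none is given.
        filter=meta.get("filter", "1 = 1"),
    )


# Metadata a business user might supply through a form instead of writing SQL.
orders_meta = {
    "target": "stg_orders",
    "source": "raw.orders",
    "columns": [
        {"name": "order_id", "expr": "id"},
        {"name": "order_total", "expr": "CAST(amount AS DECIMAL(12, 2))"},
    ],
    "filter": "deleted_at IS NULL",
}

if __name__ == "__main__":
    print(render_stage_sql(orders_meta))
```

In this sketch the user supplies only the metadata dictionary, while the SQL shape lives in one template. Editing `STAGE_TEMPLATE` once changes the SQL generated for every step rendered from it.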
The Market for DAaaS Products
One of the first vendors to introduce a DAaaS product is Coalesce, which offers a low-code/no-code data transformation product that enables data architects to build guardrails into GUI-based development environments via templates so that business analysts can build architecturally compliant data repositories and pipelines.
We expect many other vendors to introduce DAaaS-based products in the next two or three years. Every vendor wants to help businesses accelerate, if not automate, the development of data pipelines to feed analytical and operational requirements. DAaaS promises to remove the last remaining bottleneck between business users and data. As such, DAaaS represents the culmination of self-service in the analytics field.
To prepare for the deluge of user-friendly data development products, we’ve prepared a checklist of features and functions that we think every DAaaS-compliant product should support. Although every vendor will take a different approach to abstracting architecture, we think there are core principles that characterize a DAaaS product. As in any emerging discipline, this list is bound to evolve as customers implement these products in production environments.
Here are criteria for evaluating DAaaS products:
Configurable. Can architects or data engineers configure the product using templates, blocks, or other constructs to simplify the development of data pipelines?
Multi-code. Does the product support no-code, low-code, and all-code development so that it can be used by all types of users?
Metadata-driven. Does the product automatically generate SQL code that developers can view and modify if desired?
Universal updates. Can developers update a template in one place and ripple changes automatically to all solutions that leverage the code?
Platform agnostic. Does the product run on multiple data platforms, adjusting SQL output as needed? Can users migrate platforms without rewriting code?
Connectors. Does the tool connect to multiple source systems and load data into multiple targets?
Orchestration. Can the product execute transformations and tasks according to a workflow designed within a directed acyclic graph?
Data lineage. Does the tool track lineage of data elements from source to target?
Data catalog. Can users search and reuse existing data pipelines or data assets to speed development and promote reuse?
Documentation. Does the tool automatically generate documentation that speeds the onboarding of new developers and can be shared with auditors?
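The orchestration and data lineage criteria above both rest on the same structure: a directed acyclic graph of pipeline steps. The following Python sketch shows one way to derive an execution order and an upstream lineage set from such a graph, using the standard library's `graphlib` module (Python 3.9 or later). The step names and the `execution_order` and `upstream` helpers are hypothetical, and a commercial product would likely track lineage at the column level rather than the step level.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each step maps to the set of steps it depends on.
dependencies = {
    "stg_orders": set(),
    "stg_customers": set(),
    "dim_customer": {"stg_customers"},
    "fct_orders": {"stg_orders", "dim_customer"},
}


def execution_order(deps: dict) -> list:
    """Return the steps in an order that runs every dependency first.

    TopologicalSorter raises graphlib.CycleError if the graph has a cycle,
    which is how a non-acyclic workflow would be rejected here.
    """
    return list(TopologicalSorter(deps).static_order())


def upstream(step: str, deps: dict) -> set:
    """Return every step that feeds, directly or indirectly, into `step`."""
    seen = set()
    stack = [step]
    while stack:
        for parent in deps[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


if __name__ == "__main__":
    print(execution_order(dependencies))
    print(sorted(upstream("fct_orders", dependencies)))
```

Because both functions read the same dependency metadata, the run order and the lineage view cannot drift apart, which is the practical benefit of keeping pipelines metadata-driven.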
Next Generation DWA. The features above align with a prior generation of products that automated the generation of data warehousing structures. Like data warehouse automation (DWA) products, DAaaS products rely on metadata to automatically generate SQL code, simplify changes, eliminate errors, and auto-generate documentation, including lineage. The main difference is that DAaaS products are low-code, no-code environments geared to both business users and IT developers, not just IT developers. And DAaaS products are designed to run in the cloud as easy-to-use SaaS products.
DAaaS is not a new concept. Architects and engineers have long created development templates to foster reuse, speed development, and improve accuracy and efficiency. But DAaaS commercializes this approach, baking templates and automation into low-code/no-code data pipeline products geared to business users. The ten criteria above represent base-level functionality that we believe all DAaaS products should support.