An in-memory, open-source engine for processing large volumes of data that supports data science, data engineering, and SQL workloads on single nodes or clusters.
Added Perspectives
The Spark platform prepares the data in micro-batches for consumption by the HDInsight data lake, the SQL data warehouse, and various other internal and external subscribers. These targets subscribe to topics that are categorized by source table. With this CDC-based architecture, StartupBackers now supports real-time analysis efficiently, without affecting production operations.
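The flow described above (change events grouped into micro-batches, routed to one topic per source table, and delivered to whichever targets subscribe) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the StartupBackers implementation and not Spark's API: the event shape, the `cdc.<table>` topic naming, the batch size, and the `data_lake` and `warehouse` callbacks are all hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List

# Hypothetical event shape: {"table": ..., "op": ..., "row": ...}
ChangeEvent = Dict[str, object]
Subscriber = Callable[[str, List[ChangeEvent]], None]


def topic_for(event: ChangeEvent) -> str:
    # Assumption: one topic per source table, named "cdc.<table>".
    return f"cdc.{event['table']}"


def micro_batches(events: Iterable[ChangeEvent], batch_size: int) -> Iterator[List[ChangeEvent]]:
    """Yield consecutive lists of at most batch_size events."""
    batch: List[ChangeEvent] = []
    for event in events:
        batch.append(event)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly smaller, batch
        yield batch


def dispatch(events: Iterable[ChangeEvent],
             subscriptions: Dict[str, List[Subscriber]],
             batch_size: int = 2) -> None:
    """Group each micro-batch by topic and hand it to that topic's subscribers."""
    for batch in micro_batches(events, batch_size):
        by_topic: Dict[str, List[ChangeEvent]] = defaultdict(list)
        for event in batch:
            by_topic[topic_for(event)].append(event)
        for topic, topic_events in by_topic.items():
            for subscriber in subscriptions.get(topic, []):
                subscriber(topic, topic_events)


# Hypothetical targets that simply record what they receive.
received: Dict[str, list] = defaultdict(list)


def data_lake(topic: str, topic_events: List[ChangeEvent]) -> None:
    received["data_lake"].append((topic, len(topic_events)))


def warehouse(topic: str, topic_events: List[ChangeEvent]) -> None:
    received["warehouse"].append((topic, len(topic_events)))


events = [
    {"table": "orders", "op": "insert", "row": {"id": 1}},
    {"table": "customers", "op": "update", "row": {"id": 7}},
    {"table": "orders", "op": "update", "row": {"id": 1}},
]
subscriptions = {
    "cdc.orders": [data_lake, warehouse],
    "cdc.customers": [data_lake],
}
dispatch(events, subscriptions, batch_size=2)
```

Because each target reads only the topics it subscribes to, the hypothetical warehouse here never sees changes to `customers`. Delivery also happens downstream of change capture, which is consistent with the claim that analysis does not affect production operations.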