An open-source system for data streaming in which a broker queues and persists data records (also called events or messages) between sources (publishers) and targets (subscribers). Records are organized into ordered streams called topics.
In a nutshell, Kafka "producers" publish data records to brokers, which persist those records to file systems on disk until they are read, either in real time or later, by "consumers." Records are divided into topics (a.k.a. streams), and consumers subscribe to the topics they need. Topics can also be partitioned, which improves throughput via parallel consumption and enables redundancy via partition replicas spread across multiple clustered brokers. Here is a sample architecture for one topic. While this shows multiple producers, one topic often maps to a single producer.
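The flow above can be sketched with a toy in-memory model. This is not the Kafka API, and `MiniBroker`, its method names, and the partitioning scheme are illustrative assumptions; it only demonstrates the core ideas of key-based partitioning, append-only logs, and consumers reading by offset:

```python
from collections import defaultdict

class MiniBroker:
    """Toy in-memory stand-in for a Kafka broker (illustrative only).

    A topic is a fixed set of partitions; each partition is an
    append-only log that persists records after they are read.
    """
    def __init__(self, partitions_per_topic=2):
        self.partitions_per_topic = partitions_per_topic
        self.topics = defaultdict(
            lambda: [[] for _ in range(partitions_per_topic)]
        )

    def publish(self, topic, key, value):
        # Like Kafka's default partitioner, route by key so records
        # with the same key land in the same partition, in order.
        partition = hash(key) % self.partitions_per_topic
        log = self.topics[topic][partition]
        log.append(value)
        return partition, len(log) - 1  # (partition, offset)

    def consume(self, topic, partition, offset=0):
        # Reads are non-destructive: consumers track their own offsets
        # and can re-read or catch up later.
        return self.topics[topic][partition][offset:]

broker = MiniBroker()
broker.publish("orders", key="cust-1", value="order-a")
broker.publish("orders", key="cust-1", value="order-b")
part, _ = broker.publish("orders", key="cust-1", value="order-c")

# Same key -> same partition, records come back in publish order.
print(broker.consume("orders", part))   # ['order-a', 'order-b', 'order-c']
print(broker.consume("orders", part, offset=2))  # ['order-c']
```

Because ordering is guaranteed only within a partition, choosing a good record key (here, a hypothetical customer ID) is what lets a real Kafka deployment scale consumption in parallel while keeping related records in order.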