A Kafka cluster consists of multiple brokers that share the load across the cluster.
ZooKeeper keeps the state of the cluster (brokers, topics and users) and is used to manage and coordinate the Kafka brokers. It tracks which brokers are alive, so when a new broker is added to an existing cluster or a broker fails, the change is made known to producers and consumers (directly through ZooKeeper in old client versions, through broker metadata in current ones). If a broker fails, producers and consumers can then switch to other brokers to continue their work.
The Kafka broker is where the data sent by producers is stored. When data arrives, the broker is responsible for receiving and storing it. ZooKeeper holds the metadata of the cluster; if something in the cluster fails, the brokers can use this metadata to recover by coordinating with ZooKeeper.
A topic is a logical group of partitions in a Kafka cluster. When a broker receives messages from a producer, it stores them in a partition in order, assigning each message an offset that is unique within that partition. Consumers pull messages from the topic by specifying a partition and an offset.
A partition is an ordered collection of a topic's messages, each identified by its offset. The most recently published message is stored at the end of the partition. A consumer can pull messages from a start offset up to the end offset of the partition.
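The append-and-read behaviour of a partition can be illustrated with a small in-memory model. This is only a sketch for intuition, not Kafka's storage code: the class name Partition and its methods append, read and end_offset are hypothetical names chosen for this example.

```python
class Partition:
    """Toy in-memory model of a partition: an append-only list of messages."""

    def __init__(self):
        self._log = []

    def append(self, message):
        """Store the message at the end of the log and return its offset."""
        self._log.append(message)
        return len(self._log) - 1

    @property
    def end_offset(self):
        """Offset that the next appended message will receive."""
        return len(self._log)

    def read(self, start_offset):
        """Return (offset, message) pairs from start_offset up to the end."""
        return [(o, self._log[o]) for o in range(start_offset, self.end_offset)]


partition = Partition()
first = partition.append("order-created")   # offset 0
second = partition.append("order-paid")     # offset 1
tail = partition.read(1)                     # only the newest message
```

Because offsets are simply positions in the log, a reader that remembers the last offset it saw can resume from exactly that point.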
The replication factor defines the number of copies of each partition. Partitions are replicated for fault tolerance. Each partition has one leader and zero or more followers. The broker-level default (default.replication.factor) is 1; a replication factor of 3 is a common choice for production topics.
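To make the leader and follower roles concrete, the sketch below spreads the replicas of each partition over a list of brokers in simple round-robin order and treats the first replica as the leader. This is a simplified illustration, not Kafka's actual placement algorithm; the function name assign_replicas and the shape of its result are assumptions made for this example.

```python
def assign_replicas(broker_ids, num_partitions, replication_factor):
    """Place each partition's replicas on consecutive brokers (toy model)."""
    if replication_factor > len(broker_ids):
        # A partition cannot have more copies than there are brokers.
        raise ValueError("replication factor cannot exceed the number of brokers")

    assignment = {}
    n = len(broker_ids)
    for p in range(num_partitions):
        # Start at a different broker for each partition to spread leaders.
        replicas = [broker_ids[(p + i) % n] for i in range(replication_factor)]
        assignment[p] = {"leader": replicas[0], "followers": replicas[1:]}
    return assignment


# Three brokers, three partitions, two copies of each partition.
layout = assign_replicas([1, 2, 3], num_partitions=3, replication_factor=2)
# layout[0] == {"leader": 1, "followers": [2]}
```

With a replication factor of 1 the followers list is empty, which matches the "zero or more followers" case.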
A producer is an application that publishes messages to the Kafka cluster by pushing them to Kafka topics.
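A consumer is a subscriber that consumes messages from a broker by topic, partition and offset. The consumer records the offsets of the messages it has consumed: older consumers did this by coordinating with ZooKeeper, while current consumers commit offsets to Kafka itself, in the internal __consumer_offsets topic.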
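The way a consumer pulls messages and records its progress can be sketched without a real broker. In the toy model below a plain Python list stands in for one partition; the class ToyConsumer and its methods poll and commit are hypothetical names for this illustration and are not the API of any Kafka client library.

```python
class ToyConsumer:
    """Toy consumer that reads one partition (a list) and tracks offsets."""

    def __init__(self, log, committed_offset=0):
        self.log = log                     # list standing in for a partition
        self.position = committed_offset   # next offset to read
        self.committed = committed_offset  # last recorded progress

    def poll(self, max_messages=2):
        """Pull up to max_messages starting at the current position."""
        batch = self.log[self.position:self.position + max_messages]
        self.position += len(batch)
        return batch

    def commit(self):
        """Record the offset of the next message to read."""
        self.committed = self.position
        return self.committed


log = ["a", "b", "c"]
consumer = ToyConsumer(log)
batch = consumer.poll()        # ["a", "b"]
saved = consumer.commit()      # 2

# A restarted consumer resumes from the committed offset.
restarted = ToyConsumer(log, committed_offset=saved)
remaining = restarted.poll()   # ["c"]
```

The committed value is the offset of the next message to read, so a consumer that restarts from it neither skips nor rereads the messages it already committed.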
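A consumer group is a collection of consumers configured with the same group id (group.id). The partitions of a subscribed topic are divided among the members of the group, so each partition is read by only one consumer in the group.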
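A common reason to place consumers in one group is to split a topic's partitions among them so that each partition is handled by a single member. The sketch below shows one simple way to do that, handing out partitions in round-robin order. It is an illustration only: Kafka supports several assignment strategies, and the function name assign_partitions is an assumption made for this example.

```python
def assign_partitions(partitions, consumers):
    """Divide partitions among group members in round-robin order (toy model)."""
    if not consumers:
        raise ValueError("a consumer group needs at least one member")

    members = sorted(consumers)
    assignment = {member: [] for member in members}
    for i, partition in enumerate(sorted(partitions)):
        # Each partition goes to exactly one member of the group.
        assignment[members[i % len(members)]].append(partition)
    return assignment


group = assign_partitions([0, 1, 2, 3], ["c1", "c2"])
# group == {"c1": [0, 2], "c2": [1, 3]}
```

If the group has more members than the topic has partitions, the extra members receive an empty list and stay idle in this model.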