BSR Tutorials

Kafka - Use Cases

Messaging
- Kafka is a real-time publish-subscribe messaging system.
- Kafka can replace traditional message brokers such as ActiveMQ or RabbitMQ.
- Kafka is well suited to large-scale message processing applications.
- Compared with most traditional messaging systems, Kafka offers better throughput, built-in partitioning, replication, and fault tolerance, which makes it a good fit for large-scale processing applications.
- Kafka can buffer unprocessed messages.
- Kafka provides strong durability guarantees and low end-to-end latency.
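
The partitioning and buffering ideas above can be sketched with a small in-memory model. This is not a Kafka client: the class `InMemoryTopic`, its methods, and the CRC32-based partition choice are assumptions made for illustration, and Kafka's own partitioner uses a different hash.

```python
import zlib
from collections import defaultdict


class InMemoryTopic:
    """Toy stand-in for a partitioned topic; not a Kafka client."""

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]
        # Next offset to read, per (consumer group, partition).
        self.committed = defaultdict(int)

    def publish(self, key, value):
        # A stable hash keeps every message with the same key in one partition.
        partition = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        self.partitions[partition].append((key, value))
        return partition

    def poll(self, group, partition):
        # Return messages the group has not read yet, then advance its offset.
        start = self.committed[(group, partition)]
        batch = self.partitions[partition][start:]
        self.committed[(group, partition)] = start + len(batch)
        return batch


topic = InMemoryTopic(num_partitions=3)
p1 = topic.publish("order-42", "created")
p2 = topic.publish("order-42", "paid")

# Both messages share a key, so they land in the same partition, in order.
first_read = topic.poll("billing", p1)
# A second group reads the same messages independently (publish-subscribe).
other_group = topic.poll("audit", p1)
# Nothing new has been published, so billing now gets an empty batch.
second_read = topic.poll("billing", p1)
```

Messages stay in the partition after they are read, which is what lets the topic buffer work for a slow consumer and serve several consumer groups from the same data.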
Website Activity Tracking
- Kafka is used to build a user activity tracking pipeline as a set of real-time publish-subscribe feeds.
- User activities such as page views, searches, or other actions are published to Kafka Topics, with one topic per activity type.
- These Kafka Topics are available for subscription by a range of use cases:
  1. real-time processing,
  2. real-time monitoring,
  3. real-time fraud detection,
  4. loading into Hadoop or offline data warehousing systems for offline processing and reporting.
- Activity tracking is often very high volume, because many activity messages are generated for each user page view.
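
A minimal sketch of routing activity events to one topic per activity type, using plain Python lists in place of Kafka Topics. The `activity.<type>` naming scheme, the event fields, and the helper names are hypothetical.

```python
from collections import defaultdict

# In-memory stand-in for Kafka Topics: topic name -> list of events.
topics = defaultdict(list)


def topic_for(event):
    # Hypothetical naming scheme: one topic per activity type.
    return "activity." + event["type"]


def publish(event):
    topics[topic_for(event)].append(event)


events = [
    {"user": "u1", "type": "page_view", "page": "/home"},
    {"user": "u1", "type": "search", "query": "kafka"},
    {"user": "u2", "type": "page_view", "page": "/pricing"},
]
for event in events:
    publish(event)

# A subscriber interested only in page views reads just that topic.
page_views = [e["page"] for e in topics["activity.page_view"]]
```

Splitting by type means a fraud detector can subscribe to one feed without reading the much larger page-view feed.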
Metrics
- Kafka is used to monitor operational data, including data from IT infrastructure, software, and security logs.
- For example, key system performance metrics can be collected at periodic intervals and monitored over time.
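
As a rough illustration of periodic metric collection, the sketch below groups timestamped samples into fixed intervals and averages each one. The function name, the 60-second interval, and the CPU samples are assumptions, and the samples would come from a Kafka Topic in a real pipeline.

```python
from collections import defaultdict


def average_per_interval(samples, interval_seconds):
    """Group (timestamp, value) samples into fixed windows and average each."""
    buckets = defaultdict(list)
    for timestamp, value in samples:
        # Round the timestamp down to the start of its window.
        window_start = timestamp - (timestamp % interval_seconds)
        buckets[window_start].append(value)
    return {start: sum(vals) / len(vals) for start, vals in sorted(buckets.items())}


# Hypothetical CPU-usage samples: (seconds since start, percent).
cpu_samples = [(0, 20.0), (30, 40.0), (60, 50.0), (90, 70.0), (120, 10.0)]
averages = average_per_interval(cpu_samples, interval_seconds=60)
```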
Log Aggregation
- Many people use Kafka as a log aggregation solution.
- Log aggregation collects physical log files from various servers or other sources and puts them in a central place, such as a file server or HDFS, for processing.
- Compared to other log-centric systems like Scribe or Flume, Kafka offers equally good performance, stronger durability guarantees due to replication, and much lower end-to-end latency.
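
The "collect into a central place" step can be sketched by merging per-server log streams into one stream. The server names and log entries are made up, and ordering the merged stream by timestamp is a choice of this sketch, since Kafka itself guarantees order only within a single partition.

```python
import heapq

# Hypothetical per-server logs, each already sorted by timestamp.
web1 = [(1, "web1", "GET /"), (4, "web1", "GET /cart")]
web2 = [(2, "web2", "GET /login"), (3, "web2", "POST /login")]

# Merge the per-server streams into one central, time-ordered stream.
central_log = list(heapq.merge(web1, web2))
sources_in_order = [server for _, server, _ in central_log]
```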
Stream Processing
- A data processing pipeline consists of multiple stages, where raw input data is consumed from Kafka Topics and then aggregated, enriched, or otherwise transformed into new topics for further processing.
- Processing systems such as Apache Storm, Apache Samza, and Apache Spark can consume data from Kafka Topics, process it, and write the enriched data to new Kafka Topics, where real-time applications consume it to display dashboards, metrics, or visualizations.
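
A two-stage pipeline of this kind can be sketched with lists standing in for the input and output topics. The lookup table, field names, and function names are hypothetical, and a real deployment would use one of the processing systems named above rather than list comprehensions.

```python
# Hypothetical reference data used to enrich raw events.
USER_COUNTRY = {"u1": "DE", "u2": "IN"}

# Stand-in for the raw input topic.
raw_topic = [
    {"user": "u1", "amount": 30},
    {"user": "u2", "amount": 5},
    {"user": "u1", "amount": 12},
]


def enrich(event):
    # Stage 1: add a field derived from reference data.
    return {**event, "country": USER_COUNTRY.get(event["user"], "unknown")}


def total_by_country(events):
    # Stage 2: aggregate the enriched events, for example for a dashboard.
    totals = {}
    for event in events:
        totals[event["country"]] = totals.get(event["country"], 0) + event["amount"]
    return totals


# The enriched events form a new topic that later stages consume.
enriched_topic = [enrich(e) for e in raw_topic]
dashboard_totals = total_by_country(enriched_topic)
```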
Event Sourcing
- Apache Kafka can act as an event store for event sourcing, in which all changes to application state are stored as a sequence of events.
- For example, a transport vehicle tracking application can record each arrival and departure as an event, which lets us tell when a bus or train arrives at or leaves a station or bus stop.
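
The vehicle tracking example can be sketched as an event log plus a replay function that rebuilds the current state from it. The event fields, vehicle names, and stop names are illustrative assumptions, and the list stands in for a Kafka Topic.

```python
# Append-only event log; in practice this would be a Kafka Topic.
events = [
    {"vehicle": "bus-7", "type": "arrived", "stop": "Central"},
    {"vehicle": "bus-7", "type": "departed", "stop": "Central"},
    {"vehicle": "bus-7", "type": "arrived", "stop": "Harbor"},
    {"vehicle": "train-2", "type": "arrived", "stop": "Central"},
]


def replay(event_log):
    """Rebuild current vehicle positions by applying events in order."""
    state = {}
    for event in event_log:
        if event["type"] == "arrived":
            state[event["vehicle"]] = "at " + event["stop"]
        elif event["type"] == "departed":
            state[event["vehicle"]] = "left " + event["stop"]
    return state


current = replay(events)
# Replaying a prefix of the log gives the state at an earlier point in time.
earlier = replay(events[:2])
```

Because the state is derived entirely from the log, it can be thrown away and rebuilt at any time, or rebuilt as of any earlier event.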
Commit Log
- Apache Kafka can serve as an external commit log for a distributed system. The log helps replicate data between nodes and lets failed nodes restore their data.
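
A minimal sketch of replication and recovery through a shared commit log, assuming a hypothetical `Node` class and a plain list as the log. It shows only the replay idea, not Kafka's replication protocol.

```python
class Node:
    """Toy key-value node that builds its state from a shared log."""

    def __init__(self):
        self.data = {}
        self.applied = 0  # number of log entries already applied

    def catch_up(self, log):
        # Apply only the entries this node has not seen yet.
        for key, value in log[self.applied:]:
            self.data[key] = value
        self.applied = len(log)


commit_log = []  # shared, append-only list of (key, value) writes
leader, follower = Node(), Node()

for write in [("a", 1), ("b", 2)]:
    commit_log.append(write)
leader.catch_up(commit_log)
follower.catch_up(commit_log)

commit_log.append(("a", 3))
leader.catch_up(commit_log)

# The follower fails and loses its state; a fresh node replays the log from 0.
follower = Node()
follower.catch_up(commit_log)
```

After the replay, the replacement node holds the same data as the leader, including the write it missed while it was down.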

