Savindu Pasintha
2 min read · Oct 22, 2023

Common Use Cases for Kafka 🤔

While we've explored numerous Kafka concepts, let's now focus on the primary use cases that data engineers frequently encounter when working with this system.

Use Cases for Kafka

Let's take a closer look:

Website Activity Tracking.

➡️ The original use case for Kafka, which was developed at LinkedIn.
➡️ Events on the website, such as page views and conversions, are sent through a gateway and piped to Kafka topics.
➡️ These events are forwarded to downstream analytical systems or processed in real time.
➡️ Kafka acts as an initial buffer because the data volumes are usually large. With topic replication and suitable producer settings (for example, acks=all), acknowledged messages survive the loss of a broker.
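
As a rough illustration of the gateway step, here is a minimal sketch of turning a page view into a keyed record. The field names, the `page-views` topic name, the partition count, and the CRC32-based partition choice are all assumptions made for this example. Real Kafka clients ship their own partitioner, and nothing here talks to a broker.

```python
import json
import zlib

TOPIC = "page-views"   # hypothetical topic name
NUM_PARTITIONS = 6     # hypothetical partition count


def build_record(user_id: str, url: str, event_type: str, ts_ms: int) -> tuple[bytes, bytes]:
    """Return (key, value) bytes for one website event."""
    key = user_id.encode("utf-8")
    value = json.dumps(
        {"user_id": user_id, "url": url, "type": event_type, "ts_ms": ts_ms},
        sort_keys=True,
    ).encode("utf-8")
    return key, value


def choose_partition(key: bytes, num_partitions: int = NUM_PARTITIONS) -> int:
    """Stand-in for a client partitioner: the same key always maps to the same partition."""
    return zlib.crc32(key) % num_partitions


key, value = build_record("user-42", "/pricing", "page_view", 1697932800000)
partition = choose_partition(key)
```

Keying by user ID is one possible design choice: it keeps all events of one user in a single partition, so they stay in order for downstream consumers.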

Database Replication.

➡️ The database commit log is piped to a Kafka topic.
➡️ The committed changes are applied to a new database in the same order. Kafka preserves ordering only within a partition, so changes that must stay ordered relative to each other need to land in the same partition.
➡️ The result is a replica of the original database.
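
The replay step can be sketched with an in-memory dictionary standing in for the new database. The event shape (`op`, `id`, `row`) is a hypothetical format invented for this example, not the output of any particular change data capture tool.

```python
from typing import Any

# Hypothetical change events, in commit order, as they might be read from a topic.
change_log = [
    {"op": "insert", "id": 1, "row": {"name": "Ada", "plan": "free"}},
    {"op": "update", "id": 1, "row": {"plan": "pro"}},
    {"op": "insert", "id": 2, "row": {"name": "Linus", "plan": "free"}},
    {"op": "delete", "id": 2},
]


def apply_change(replica: dict[int, dict[str, Any]], event: dict[str, Any]) -> None:
    """Apply one change event to the replica in place."""
    op = event["op"]
    if op == "insert":
        replica[event["id"]] = dict(event["row"])
    elif op == "update":
        replica[event["id"]].update(event["row"])
    elif op == "delete":
        replica.pop(event["id"], None)
    else:
        raise ValueError(f"unknown op: {op}")


replica: dict[int, dict[str, Any]] = {}
for event in change_log:  # replaying in commit order is what makes the copy faithful
    apply_change(replica, event)
```

Replaying the same events in a different order (for example, the delete before the insert of row 2) would leave the replica in a different state, which is why ordering matters here.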

Log/Metrics Aggregation.

➡️ Kafka is used for centralized log and metrics collection.
➡️ Daemons such as Fluentd are deployed on servers or in containers alongside the applications being monitored.
➡️ Applications send their logs and metrics to the daemons.
➡️ The daemons pipe the logs and metrics to a Kafka topic.
➡️ Logs are delivered downstream to a store such as Elasticsearch for log discovery, and metrics to a store such as InfluxDB for metrics discovery.
➡️ The same pattern applies to tracking IoT fleets.
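
The fan-out at the end of this pipeline can be sketched as a small routing function. The `kind` field, the sink names, and the dead-letter bucket are assumptions for illustration; the lists below stand in for real stores, and no Fluentd, Elasticsearch, or InfluxDB API is used.

```python
import json

# Hypothetical mapping from record kind to downstream sink.
SINK_BY_KIND = {"log": "search", "metric": "timeseries"}


def route(raw: bytes) -> str:
    """Pick a sink name for one raw record; unparseable or unknown records go to dead-letter."""
    try:
        record = json.loads(raw)
    except json.JSONDecodeError:
        return "dead-letter"
    if not isinstance(record, dict):
        return "dead-letter"
    kind = record.get("kind")
    if not isinstance(kind, str):
        return "dead-letter"
    return SINK_BY_KIND.get(kind, "dead-letter")


records = [
    b'{"kind": "log", "host": "web-1", "msg": "GET /pricing 200"}',
    b'{"kind": "metric", "host": "web-1", "name": "cpu", "value": 0.42}',
    b"not json",
]

sinks: dict[str, list[bytes]] = {"search": [], "timeseries": [], "dead-letter": []}
for raw in records:
    sinks[route(raw)].append(raw)
```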

Stream Processing.

➡️ Stream processing is usually coupled with the ingestion mechanisms covered above.
➡️ Instead of piping data straight to downstream storage, we run a stream processing framework on top of the Kafka topics.
➡️ The data is filtered, enriched, and then piped to downstream systems for further use according to the use case.
➡️ This is also where machine learning models embedded in a stream processing application would run.
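
The filter-then-enrich idea can be sketched with plain Python generators. This is not the API of any stream processing framework; the event fields and the `COUNTRY_BY_USER` lookup table are hypothetical, and a list stands in for the input topic.

```python
from typing import Any, Iterable, Iterator

Event = dict[str, Any]

# Hypothetical reference data used for enrichment.
COUNTRY_BY_USER = {"user-1": "LK", "user-2": "DE"}


def only_conversions(events: Iterable[Event]) -> Iterator[Event]:
    """Filter step: keep conversion events and drop everything else."""
    return (e for e in events if e["type"] == "conversion")


def enrich(events: Iterable[Event]) -> Iterator[Event]:
    """Enrichment step: attach the user's country without mutating the input event."""
    for e in events:
        yield {**e, "country": COUNTRY_BY_USER.get(e["user_id"], "unknown")}


source = [
    {"user_id": "user-1", "type": "page_view"},
    {"user_id": "user-1", "type": "conversion"},
    {"user_id": "user-3", "type": "conversion"},
]

# Each event flows through both steps one at a time, as it would in a streaming job.
processed = list(enrich(only_conversions(source)))
```

A model-scoring step would slot in as one more generator in the same chain.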

Messaging.

➡️ Kafka can be used as a replacement for more traditional message brokers such as RabbitMQ.
➡️ Kafka persists messages in a replicated log and retains them after they are consumed, which makes it straightforward for several separate consumer groups to read the same topic independently.
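
Independent consumption by several consumer groups can be sketched with a toy, single-partition log that tracks one offset per group. The `TopicLog` class and the group names are hypothetical and only model the idea; real Kafka adds partitions, rebalancing, and explicit offset commits.

```python
class TopicLog:
    """Toy single-partition topic: an append-only list plus one read offset per group."""

    def __init__(self) -> None:
        self._messages: list[str] = []
        self._offsets: dict[str, int] = {}

    def append(self, message: str) -> None:
        self._messages.append(message)

    def poll(self, group: str, max_records: int = 10) -> list[str]:
        """Return the next batch for this group and advance only this group's offset."""
        start = self._offsets.get(group, 0)
        batch = self._messages[start:start + max_records]
        self._offsets[group] = start + len(batch)
        return batch


topic = TopicLog()
for msg in ("order-1", "order-2", "order-3"):
    topic.append(msg)

billing_first = topic.poll("billing", max_records=2)   # reads the first two messages
analytics_all = topic.poll("analytics")                # still sees all three
billing_rest = topic.poll("billing")                   # resumes where billing left off
```

Because reading does not remove messages from the log, adding a new group later needs no change on the producer side.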


Connect with me on LinkedIn: Savindu-Pasintha

Connect with me on GitHub: Savindu-Pasintha

Thanks.
