

Brief notes on Kafka: Producers and Consumers
Main intent of Kafka -> real-time data pipelines. It can handle millions of events per second.
Kafka is built for streaming:
- logs
- metrics
- transactions
- clickstreams
- IoT sensor data
- database change events
Previously, the step-by-step flow (with ZooKeeper) was as follows; a quick way to inspect it comes after the list:
- ZooKeeper starts: brokers register with it and elect a controller.
- Kafka brokers start: they open port 9092 and wait for clients.
- Producers connect: they send messages to the correct partition leader.
- Consumers connect: they join a group and fetch messages.
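Once Zookeeper and at least one broker are running (Tasks 2 and 3 below), you can peek at this older flow directly: brokers register under /brokers/ids and the elected controller is recorded under /controller. A quick sketch with the bundled zookeeper-shell.sh, assuming the default Zookeeper port 2181:
cd /usercode/kafka/bin
# Broker ids registered in Zookeeper (e.g., [0] for a single local broker)
./zookeeper-shell.sh localhost:2181 ls /brokers/ids
# The broker currently elected as controller
./zookeeper-shell.sh localhost:2181 get /controller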
In the latest flow (KRaft mode), Kafka handles the coordination role itself, so there is no more need for ZooKeeper:
- Producers publish events (orders, clicks, logs, metrics).
- Kafka stores them durably in partitions.
- Consumers read them at their own pace.
- Downstream systems (Flink, Spark, Snowflake, microservices) react to events.
Kafka becomes the central nervous system of the company.
Producer internal flow (a keyed-producer sketch follows the list):
- Connects to a broker (bootstrap server).
- Requests metadata (topic → partitions → leaders).
- Chooses a partition (round‑robin, key-based, custom).
- Sends the record to the leader broker of that partition.
- Broker writes it to the log and replicates to followers.
- Producer gets an acknowledgment (acks=0/1/all).
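As a hands-on sketch of the partition-choice step, the console producer can run in keyed mode: parse.key and key.separator are standard console-producer properties, and records that share a key are hashed to the same partition. The key names below are just examples:
cd /usercode/kafka/bin
# Each input line is split into key:value; the key is hashed to pick the partition
./kafka-console-producer.sh --broker-list localhost:9092 --topic test-topic --property parse.key=true --property key.separator=:
>user1:first click
>user1:second click
>user2:login
Both user1 records land in the same partition because they share a key.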
Consumer internal flow (a consumer-group sketch follows the list):
- Connects to a broker.
- Joins a consumer group.
- Kafka designates one broker as the group coordinator.
- Coordinator assigns partitions to consumers.
- Consumer fetches data from the leader broker.
- Consumer commits offsets (manual or auto).
- If a consumer dies, Kafka rebalances and reassigns partitions.
Consumers don’t “listen”; they pull data when ready.
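To see group membership, partition assignment, offset commits, and rebalancing in practice, you can start a console consumer with an explicit group name and then describe the group with the kafka-consumer-groups.sh tool. The group name demo-group here is just an example:
cd /usercode/kafka/bin
# Join a consumer group; start a second copy in another terminal to trigger a rebalance
./kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test-topic --group demo-group
# In another terminal: show each member's assigned partitions, committed offsets, and lag
./kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group demo-group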
Simple steps for setting up a streaming data pipeline using Kafka (Tasks 1 to 6)
Task 1: Configure Zookeeper on Localhost
Apache Zookeeper is a server used for distributed coordination. We need to configure Zookeeper and run it before starting Kafka. To perform this task, open the /usercode/kafka/config/server.properties file, which holds the Kafka broker’s configuration (Zookeeper’s own settings live in /usercode/kafka/config/zookeeper.properties).
- 9092 will be your Kafka broker’s port number. Producers and consumers use this port to communicate with the broker; Zookeeper itself listens on port 2181, set by clientPort in zookeeper.properties.
- After this edit, you’ll be good to run the Zookeeper and Kafka services.
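As a quick sanity check of the ports involved, you can grep the relevant keys out of both config files; exact contents can differ per distribution, so treat the output as informational:
# Broker settings: the listener on 9092 and the Zookeeper connection string
grep -nE 'listeners|zookeeper.connect' /usercode/kafka/config/server.properties
# Zookeeper's own client port (2181 by default)
grep -n 'clientPort' /usercode/kafka/config/zookeeper.properties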
Task 2: Start Zookeeper -> Zookeeper is a naming registry used in distributed systems for service synchronization.
In Kafka, Zookeeper is responsible for:
- Managing and tracking the status of the Kafka cluster’s nodes, topics, and messages
We need to run the Zookeeper start script and pass our properties file to start the Zookeeper server.
From the config directory, move back to the bin directory and then run the Zookeeper server:
cd /usercode/kafka/bin
./zookeeper-server-start.sh ../config/zookeeper.properties
Your Zookeeper server has started and you are now good to start your Kafka server too.
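Before starting Kafka, it’s worth confirming that Zookeeper is answering on its client port. One simple check, assuming the default port 2181, is to list the root znode with the bundled shell from the bin directory; if Zookeeper is up, it prints the root znodes and exits:
# Should print something like [zookeeper] if the server is running
./zookeeper-shell.sh localhost:2181 ls /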
Task 3: Start Kafka -> In Task 2, we successfully started our Zookeeper server. Now, let’s start our Kafka server; to do so, we need a new terminal.
- To start your Kafka server, run your kafka-server-start.sh file and pass server.properties as the argument. You can do so with the following command:
cd /usercode/kafka/bin
./kafka-server-start.sh ../config/server.properties
Now we have successfully started our Kafka server as well.
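To confirm the broker itself is reachable on port 9092, you can ask it for its supported API versions; kafka-broker-api-versions.sh ships in the same bin directory:
cd /usercode/kafka/bin
# Connects to the broker and lists the API versions it supports; failure here means the broker isn't up
./kafka-broker-api-versions.sh --bootstrap-server localhost:9092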
Task 4: Creating a New Topic -> A topic in Kafka is a logical grouping of one or more partitions. Let’s create a new topic for our project.
To create a topic in Kafka, you need to use your Kafka topic file, which has multiple arguments to pass:
- An argument that specifies the creation of a topic
- Another argument that specifies the topic name
- The hostname and port on which our Kafka server is running
- Replication factor and the number of partitions in a topic
Now that the Zookeeper and Kafka servers are up and running, create a topic using the following command:
cd /usercode/kafka/bin
./kafka-topics.sh --create --topic test-topic --bootstrap-server localhost:9092 --replication-factor 1 --partitions 4
Arguments
- kafka-topics.sh: This is used to manage the topics in our Kafka server.
- --create: This indicates that we are creating a new topic.
- --topic test-topic: Here, we specify our topic name.
- --bootstrap-server localhost:9092: Here, we specify the hostname and port number on which our Kafka server is up and running.
- --replication-factor: This sets the replication factor, i.e., the number of copies of each partition.
- --partitions: At the end of our command, we define the number of partitions in our Kafka topic.
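To double-check the result, you can list the topics and describe test-topic; the describe output should show the 4 partitions, each with its leader, replicas, and in-sync replicas (ISR):
cd /usercode/kafka/bin
# All topics known to the broker
./kafka-topics.sh --list --bootstrap-server localhost:9092
# Partition count, leader, replicas, and ISR for our topic
./kafka-topics.sh --describe --topic test-topic --bootstrap-server localhost:9092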
If at any point your application is disconnected, you can use the restart.sh file available in the /usercode/kafka directory to reconnect. This file runs all of these services:
- Zookeeper
- Kafka
- Create a new topic with the name test-topic
To run this file, reach the kafka directory and execute it with the following commands:
cd /usercode/kafka
./restart.sh
Task 5: Creating a New Producer
The producer in Kafka sends messages/data to the partitions of a topic. The producer needs to know the topic name and the hostname where the topic is available; it then writes messages to that topic, from which every consumer subscribed to the topic can read them.
cd /usercode/kafka/bin
Now, you need to create a producer, which will be used to produce/send data to the topic. To make a producer, use the following command:
./kafka-console-producer.sh --broker-list localhost:9092 --topic test-topic
- kafka-console-producer.sh: We are using our terminal to produce messages, so we use this console producer script.
- --broker-list localhost:9092: We specify our hostname and port number here.
- --topic test-topic: This is used to specify the topic to which our producer sends messages.
Running this command will create a new producer, and we will see the > sign in the terminal. Then, we will be able to write any message and press the enter key to send it.
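A short sample session might look like this; each line typed after the > prompt becomes one record in test-topic (the messages here are arbitrary examples):
>hello kafka
>order placed: id=42
>sensor reading: 21.5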
Task 6: Create a New Consumer
A consumer in Kafka receives/consumes the messages/data generated by the producer. We can use the console-based Kafka consumer file available in our bin directory to create a new consumer.
To create a consumer, we will need to pass the hostname and topic name in the command.
cd /usercode/kafka/bin
- Create a new console-based consumer on your topic that will consume/receive the messages sent by the producer by using the following command:
./kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test-topic --from-beginning
- kafka-console-consumer.sh: We are using this to create a console-based consumer.
- --from-beginning: Specifies that the consumer should receive all messages available in the topic from the start, not just new ones.
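If you just want to sanity-check delivery and then exit, the console consumer also accepts a --max-messages flag; a sketch assuming the three example messages from Task 5:
cd /usercode/kafka/bin
# Read the first 3 records from the beginning of test-topic, then exit
./kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test-topic --from-beginning --max-messages 3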
Thanks,
