Apache Kafka Explained: Architecture, Use Cases, and Step-by-Step Installation on Linux
TL;DR
Apache Kafka is an open-source distributed event streaming platform used to build real-time data pipelines and streaming applications. It uses a publish-subscribe model to handle high-throughput, fault-tolerant data streams.
To install Kafka on Linux:
- Install Java (JDK 11 or newer)
- Download Kafka binaries
- Extract and configure Kafka
- Start Kafka in KRaft mode (no ZooKeeper required in modern versions)
- Create topics and test producers/consumers
What Is Apache Kafka?
Apache Kafka is an open-source distributed event streaming platform developed by the Apache Software Foundation. It is designed to handle high-throughput, low-latency data streams across distributed systems.
Kafka allows applications to publish, subscribe, store, and process streams of records in real time.
Official project: https://kafka.apache.org/
How Apache Kafka Works
Kafka operates using a publish-subscribe model.
Core components:
- Broker – Kafka server instance
- Topic – Category for organizing messages
- Producer – Sends data to Kafka
- Consumer – Reads data from Kafka
- Partition – Splits topics for scalability
- Replication – Ensures fault tolerance
In modern Kafka (3.x and later), ZooKeeper is optional: Kafka can run in KRaft mode, which removes the ZooKeeper dependency entirely.
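The publish-subscribe flow above can be sketched in a few lines of Python. This is a toy in-memory model, not the Kafka client API: producers append records to a partition chosen by hashing the record key, and consumers read each partition independently at their own offset.

```python
# Toy in-memory model of Kafka's publish-subscribe flow.
# Real Kafka clients talk to brokers over the network; this only
# illustrates how keys map to partitions and how consumers read them.

class Topic:
    def __init__(self, name, num_partitions=3):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def produce(self, key, value):
        # Records with the same key always land in the same partition,
        # which is how Kafka preserves per-key ordering.
        p = hash(key) % len(self.partitions)
        self.partitions[p].append((key, value))
        return p

    def consume(self, partition, offset=0):
        # Each consumer tracks its own offset into each partition.
        return self.partitions[partition][offset:]

topic = Topic("clicks")
p = topic.produce("user-42", "page_view")
topic.produce("user-42", "add_to_cart")
print(topic.consume(p))  # both events for user-42, in order
```

Because partition choice depends only on the key, all events for one key stay ordered, while different keys spread across partitions for parallelism.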
Why Apache Kafka Is Popular
Kafka is widely adopted because it provides:
- High throughput (millions of messages per second)
- Horizontal scalability
- Built-in replication
- Fault tolerance
- Real-time stream processing
- Log retention capabilities
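Log retention, for instance, works by discarding data older than a configured time window (Kafka's `retention.ms` setting). A rough sketch of the idea, simplified to individual records (Kafka actually deletes whole log segments):

```python
# Rough sketch of time-based log retention (cf. Kafka's retention.ms).
# Kafka deletes entire log segments, not single records; this
# simplification just keeps records newer than the retention window.

def apply_retention(log, now_ms, retention_ms):
    """Return the records still within the retention window.

    log: list of (timestamp_ms, value) tuples, oldest first.
    """
    cutoff = now_ms - retention_ms
    return [(ts, v) for ts, v in log if ts >= cutoff]

log = [(1_000, "old"), (50_000, "recent"), (90_000, "new")]
print(apply_retention(log, now_ms=100_000, retention_ms=60_000))
# -> [(50000, 'recent'), (90000, 'new')]
```

This is why Kafka can replay history: consumers can re-read any record that has not yet aged out of the retention window.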
It is commonly used in:
- Real-time analytics
- Log aggregation
- Event-driven microservices
- Financial transaction pipelines
- Monitoring systems
System Requirements Before Installing Kafka on Linux
Minimum recommended setup:
- Ubuntu 22.04 or any modern Linux distribution
- 4 GB RAM minimum (8 GB recommended)
- OpenJDK 11 or higher
- Sudo privileges
- 10+ GB disk space for logs
How to Install Apache Kafka on Linux (Ubuntu 22.04 Example)
This guide uses Kafka 3.6+ in KRaft mode (recommended).
Step 1: Install Java (OpenJDK 11 or 17)
Update system:
sudo apt update
Install Java:
sudo apt install openjdk-17-jdk -y
java -version
Kafka requires Java 11 or later.
Step 2: Download Apache Kafka
Download the Kafka binary from the official site (version 3.6.1 is shown here; check https://kafka.apache.org/downloads for the latest release):
wget https://downloads.apache.org/kafka/3.6.1/kafka_2.13-3.6.1.tgz
Extract files:
tar -xzf kafka_2.13-3.6.1.tgz
sudo mv kafka_2.13-3.6.1 /usr/local/kafka
Navigate to Kafka directory:
cd /usr/local/kafka
Step 3: Configure Kafka (KRaft Mode, No ZooKeeper)
Generate a cluster ID:
bin/kafka-storage.sh random-uuid
Format the storage directory, replacing <CLUSTER_ID> with the value generated above:
bin/kafka-storage.sh format -t <CLUSTER_ID> -c config/kraft/server.properties
Step 4: Start Kafka Server
Start Kafka broker:
bin/kafka-server-start.sh config/kraft/server.properties
Kafka will start on default port 9092.
To run it in the background instead:
nohup bin/kafka-server-start.sh config/kraft/server.properties &
How to Create a Kafka Topic
Open a new terminal:
cd /usr/local/kafka
Create topic:
bin/kafka-topics.sh --create --topic sampleTopic --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1
List topics:
bin/kafka-topics.sh --list --bootstrap-server localhost:9092
How to Send and Receive Messages in Kafka
Start Producer
bin/kafka-console-producer.sh --topic sampleTopic --bootstrap-server localhost:9092
Type messages:
Hello Kafka
Real-time streaming test
Start Consumer
Open new terminal:
bin/kafka-console-consumer.sh --topic sampleTopic --bootstrap-server localhost:9092 --from-beginning
You should see messages appear instantly.
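The `--from-beginning` flag matters because each consumer conceptually keeps an offset into the topic's log. A toy model of that behavior (not the real client protocol):

```python
# Toy model of consumer offsets. A console consumer without
# --from-beginning starts at the end of the log (new messages only);
# with --from-beginning it starts at offset 0 and replays everything.

log = ["Hello Kafka", "Real-time streaming test"]

def start_offset(log, from_beginning):
    # Offset 0 replays history; len(log) skips to new messages.
    return 0 if from_beginning else len(log)

offset = start_offset(log, from_beginning=True)
print(log[offset:])  # replays both existing messages

offset = start_offset(log, from_beginning=False)
log.append("a new message")
print(log[offset:])  # only messages produced after joining
```

In real deployments, consumer groups persist these offsets in Kafka itself, so a restarted consumer resumes where it left off.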
Optional: Running Kafka as a Systemd Service
Create service file:
sudo nano /etc/systemd/system/kafka.service
Add:
[Unit]
Description=Apache Kafka Server
After=network.target
[Service]
Type=simple
ExecStart=/usr/local/kafka/bin/kafka-server-start.sh /usr/local/kafka/config/kraft/server.properties
ExecStop=/usr/local/kafka/bin/kafka-server-stop.sh
Restart=on-failure
[Install]
WantedBy=multi-user.target
Reload systemd:
sudo systemctl daemon-reload
sudo systemctl enable kafka
sudo systemctl start kafka
How to Secure Apache Kafka
Kafka security options include:
- SSL encryption
- SASL authentication
- ACL authorization
- Network-level firewall rules
For production deployments, configure:
- listeners=SSL
- sasl.enabled.mechanisms
- authorizer.class.name
Refer to official Kafka security documentation.
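For illustration only, a minimal `server.properties` fragment enabling an SSL listener alongside the plaintext one might look like the following. The paths and passwords are placeholders, not a working production configuration:

```properties
# Hypothetical server.properties fragment: add an SSL listener on 9093.
# Keystore/truststore paths and passwords below are placeholders.
listeners=PLAINTEXT://:9092,SSL://:9093
ssl.keystore.location=/etc/kafka/ssl/kafka.keystore.jks
ssl.keystore.password=changeit
ssl.truststore.location=/etc/kafka/ssl/kafka.truststore.jks
ssl.truststore.password=changeit
# SASL mechanisms and ACL authorizers are configured separately;
# consult the official Kafka security documentation before production use.
```

Generating the keystores and wiring up SASL/ACLs involves several more steps covered in the official documentation.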
What Is Apache Kafka Used For?
Kafka is used for:
- Event streaming
- Real-time analytics
- Log aggregation
- Microservices communication
- Fraud detection systems
- IoT data ingestion
Companies like LinkedIn, Netflix, Uber, and Airbnb use Kafka to handle massive data pipelines.
Top 5 Apache Kafka Use Cases
- Website activity tracking
- Distributed log collection
- Real-time financial transactions
- IoT sensor data streaming
- Event-driven architecture backbone
Industries That Benefit From Kafka
- Financial Services
- Retail & E-commerce
- Telecom
- Gaming
- Healthcare
- Logistics
- Automotive
Any industry handling high-volume event data can benefit from Kafka.
People Also Ask (And You Should Too!)
Q) What is Apache Kafka in simple terms?
A) Apache Kafka is a distributed messaging system that enables applications to send and receive real-time streams of data reliably and at scale.
Q) Does Kafka require ZooKeeper?
A) Older versions required ZooKeeper. Modern Kafka (3.x+) supports KRaft mode, which removes the ZooKeeper dependency.
Q) Is Kafka a message broker?
A) Yes. Kafka acts as a distributed message broker and event streaming platform.
Q) What port does Kafka use?
A) Kafka uses port 9092 by default.
Q) Is Kafka suitable for beginners?
A) Kafka is powerful but can be complex. Beginners should start with single-node local installations before scaling to clusters.
Final Summary
Apache Kafka is a distributed streaming platform used for building real-time data pipelines and event-driven systems. Installing Kafka on Linux involves installing Java, downloading Kafka binaries, configuring storage in KRaft mode, and starting the broker.
Once installed, you can create topics, produce messages, and consume real-time streams efficiently.

Jilesh Patadiya is the Founder and Chief Technology Officer (CTO) of AccuWeb.Cloud and Founder & CTO at AccuWebHosting.com. He shares his web hosting insights on the AccuWeb.Cloud blog, writing mostly about the latest web hosting trends, WordPress, storage technologies, and Windows and Linux hosting platforms.