What is Apache Kafka and How Can We Install It on Linux?

Apache Kafka Explained: Architecture, Use Cases, and Step-by-Step Installation on Linux

TL;DR

Apache Kafka is an open-source distributed event streaming platform used to build real-time data pipelines and streaming applications. It uses a publish-subscribe model to handle high-throughput, fault-tolerant data streams.

To install Kafka on Linux:

  1. Install Java (JDK 11 or newer)
  2. Download Kafka binaries
  3. Extract and configure Kafka
  4. Start Kafka in KRaft mode (no ZooKeeper required in modern versions)
  5. Create topics and test producers/consumers

What Is Apache Kafka?

Apache Kafka is an open-source distributed event streaming platform, originally developed at LinkedIn and now maintained by the Apache Software Foundation. It is designed to handle high-throughput, low-latency data streams across distributed systems.

Kafka allows applications to publish, subscribe, store, and process streams of records in real time.

Official project: https://kafka.apache.org/

How Apache Kafka Works

Kafka operates using a publish-subscribe model.

Core components:

  • Broker – Kafka server instance
  • Topic – Category for organizing messages
  • Producer – Sends data to Kafka
  • Consumer – Reads data from Kafka
  • Partition – Splits topics for scalability
  • Replication – Ensures fault tolerance

In Kafka 3.x, ZooKeeper is optional: the broker can run in KRaft mode, which replaces ZooKeeper with a built-in Raft-based metadata quorum. Kafka 4.0 removed ZooKeeper support entirely.
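To make the partition idea concrete, here is a rough sketch of how a keyed message can be mapped to a partition: hash the key, then take the remainder modulo the partition count. Kafka's default partitioner uses a murmur2 hash; this sketch substitutes cksum (a CRC) purely for illustration, so the numbers it prints will not match what a real producer picks. The function name partition_for_key is hypothetical.

```shell
# Illustration only: map a message key to a partition number.
# cksum (CRC) stands in for Kafka's murmur2 hash, so results differ from a real producer.
partition_for_key() {
  key=$1
  partitions=$2
  hash=$(printf '%s' "$key" | cksum | cut -d' ' -f1)
  echo $(( hash % partitions ))
}

# The same key always lands on the same partition, which preserves per-key ordering.
for key in order-1 order-2 order-1; do
  echo "$key -> partition $(partition_for_key "$key" 3)"
done
```

Because the mapping depends on the partition count, adding partitions to an existing topic changes where new messages for a given key land.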

Why Apache Kafka Is Popular

Kafka is widely adopted because it provides:

  • High throughput (well-sized clusters can handle millions of messages per second)
  • Horizontal scalability
  • Built-in replication
  • Fault tolerance
  • Real-time stream processing
  • Log retention capabilities

It is commonly used in:

  • Real-time analytics
  • Log aggregation
  • Event-driven microservices
  • Financial transaction pipelines
  • Monitoring systems

System Requirements Before Installing Kafka on Linux

Minimum recommended setup:

  • Ubuntu 22.04 or any modern Linux distribution
  • 4 GB RAM minimum (8 GB recommended)
  • OpenJDK 11 or higher
  • Sudo privileges
  • 10+ GB disk space for logs
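The requirements above can be checked with a short preflight script. This is a minimal sketch that assumes a Linux host with /proc/meminfo and checks free space on the root filesystem only; the thresholds mirror the list above and the helper name check_min is hypothetical.

```shell
# Preflight sketch: compare this host with the minimums listed above.
MIN_RAM_MB=4096      # 4 GB
MIN_DISK_MB=10240    # 10 GB

# Print "ok" when the actual value ($1) meets the required value ($2), else "low".
check_min() {
  if [ "$1" -ge "$2" ]; then echo ok; else echo low; fi
}

# MemTotal is reported in kB; df -Pk reports available space in 1 kB blocks.
ram_mb=$(awk '/^MemTotal:/ {print int($2 / 1024)}' /proc/meminfo 2>/dev/null || echo 0)
disk_mb=$(df -Pk / 2>/dev/null | awk 'NR==2 {print int($4 / 1024)}')

echo "RAM:  ${ram_mb:-0} MB ($(check_min "${ram_mb:-0}" "$MIN_RAM_MB"))"
echo "Disk: ${disk_mb:-0} MB free on / ($(check_min "${disk_mb:-0}" "$MIN_DISK_MB"))"
if command -v java >/dev/null 2>&1; then echo "Java: found"; else echo "Java: not found"; fi
```

A machine sold as "4 GB" often reports slightly less than 4096 MB because the kernel reserves some memory, so a "low" result near the threshold is not necessarily a problem.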

How to Install Apache Kafka on Linux (Ubuntu 22.04 Example)

This guide uses Kafka 3.6.1 in KRaft mode (recommended). File paths and flags may differ in later releases.

Step 1: Install Java (OpenJDK 11 or 17)

Update system:

sudo apt update

Install Java:

sudo apt install openjdk-17-jdk -y

To use Java 11 instead, install the openjdk-11-jdk package.

Verify installation:

java -version

Kafka requires Java 11 or later.
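The version check can also be scripted. The sketch below parses the major version out of the string that java -version prints and compares it with 11; the helper name java_major is hypothetical, and the parsing assumes the usual layout in which the version is the first quoted string in the output.

```shell
# Extract the major version from a Java version string.
# Handles the legacy "1.8.0_292" scheme and the modern "17.0.9" scheme.
java_major() {
  v=$1
  case "$v" in
    1.*) v=${v#1.} ;;
  esac
  echo "${v%%[._-]*}"
}

if command -v java >/dev/null 2>&1; then
  # java -version writes to stderr; the version is the first quoted string.
  raw=$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')
  major=$(java_major "$raw")
  if [ "${major:-0}" -ge 11 ] 2>/dev/null; then
    echo "Java $major is new enough for this guide"
  else
    echo "Java ${major:-unknown} found; this guide expects 11 or later"
  fi
else
  echo "java not found on PATH"
fi
```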

Step 2: Download Apache Kafka

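Download the Kafka binary from the official site:

wget https://downloads.apache.org/kafka/3.6.1/kafka_2.13-3.6.1.tgz

If this release is no longer listed at downloads.apache.org, older releases are kept at https://archive.apache.org/dist/kafka/.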
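Before extracting the archive, it is worth comparing it against the published checksum. The sketch below assumes a .sha512 file has been downloaded from the same directory as the archive, and it normalizes the checksum file first because Apache projects publish these files in more than one layout. The function name verify_sha512 is hypothetical.

```shell
# Compare a file's SHA-512 digest with a published checksum file.
# Accepts "HASH  filename" (sha512sum style) and "filename: HEX HEX ..." (gpg --print-md style).
verify_sha512() {
  file=$1
  sumfile=$2
  # Drop any "name:" prefix, strip whitespace, lowercase, keep the 128 hex digits.
  expected=$(sed 's/^[^:]*://' "$sumfile" | tr -d ' \t\r\n' | tr 'A-F' 'a-f' | cut -c1-128)
  actual=$(sha512sum "$file" | cut -d' ' -f1)
  [ -n "$expected" ] && [ "$expected" = "$actual" ]
}

# Hypothetical usage, assuming both files are in the current directory:
#   verify_sha512 kafka_2.13-3.6.1.tgz kafka_2.13-3.6.1.tgz.sha512 && echo "checksum OK"
```

A checksum only detects a corrupted or truncated download; verifying the release signature with the project's KEYS file is the stronger check.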

Extract the archive and move it to /usr/local/kafka:

tar -xzf kafka_2.13-3.6.1.tgz
sudo mv kafka_2.13-3.6.1 /usr/local/kafka

Navigate to Kafka directory:

cd /usr/local/kafka

Step 3: Configure Kafka (KRaft Mode – No ZooKeeper)

Generate cluster ID:

bin/kafka-storage.sh random-uuid

Format storage:

bin/kafka-storage.sh format -t <CLUSTER_ID> -c config/kraft/server.properties

Replace <CLUSTER_ID> with the value printed by the random-uuid command.
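The two storage commands can be chained so the ID never has to be copied by hand. This is a minimal sketch; the function name format_kraft_storage is hypothetical, and the default directory is the install path used in this guide.

```shell
# Generate a cluster ID and format KRaft storage in one step.
# $1 is the Kafka install directory; it defaults to the path used in this guide.
format_kraft_storage() {
  kafka_home=${1:-/usr/local/kafka}
  cluster_id=$("$kafka_home/bin/kafka-storage.sh" random-uuid) || return 1
  echo "Cluster ID: $cluster_id"
  "$kafka_home/bin/kafka-storage.sh" format -t "$cluster_id" \
    -c "$kafka_home/config/kraft/server.properties"
}

# Usage: format_kraft_storage /usr/local/kafka
```

Formatting is a one-time step per data directory. Running it again against a directory that is already formatted is expected to fail rather than overwrite the existing metadata.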

Step 4: Start Kafka Server

Start Kafka broker:

bin/kafka-server-start.sh config/kraft/server.properties

Kafka will listen on the default port, 9092.

To run in the background:

nohup bin/kafka-server-start.sh config/kraft/server.properties &
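When the broker is started in the background, the next command can run before the listener is up. The sketch below polls the port until it accepts a TCP connection. It relies on bash's /dev/tcp pseudo-device, so bash must be installed; the function name wait_for_port is hypothetical.

```shell
# Wait until a TCP port accepts connections, or give up after N seconds.
# Uses bash's /dev/tcp pseudo-device, so bash must be installed.
wait_for_port() {
  host=$1
  port=$2
  timeout=${3:-30}
  elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    if bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
      return 0
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 1
}

# Usage: wait_for_port localhost 9092 60 && echo "broker is accepting connections"
```

An open port only shows that the listener is bound; the broker may still need a moment before it can serve requests.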

How to Create a Kafka Topic

Open a new terminal:

cd /usr/local/kafka

Create topic:

bin/kafka-topics.sh --create --topic sampleTopic --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1

List topics:

bin/kafka-topics.sh --list --bootstrap-server localhost:9092
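For scripted setups, topic creation can be made safe to repeat. The sketch below uses the --if-not-exists flag of kafka-topics.sh and then describes the topic to show its partition layout. KAFKA_HOME, BOOTSTRAP, and the function name ensure_topic are assumptions chosen to match this guide's single-node setup.

```shell
# Create a topic only if it is missing, then print its partition layout.
# KAFKA_HOME and BOOTSTRAP are assumptions that match this guide's setup.
KAFKA_HOME=${KAFKA_HOME:-/usr/local/kafka}
BOOTSTRAP=${BOOTSTRAP:-localhost:9092}

ensure_topic() {
  topic=$1
  partitions=${2:-1}
  "$KAFKA_HOME/bin/kafka-topics.sh" --create --if-not-exists \
    --topic "$topic" --partitions "$partitions" --replication-factor 1 \
    --bootstrap-server "$BOOTSTRAP" || return 1
  "$KAFKA_HOME/bin/kafka-topics.sh" --describe --topic "$topic" \
    --bootstrap-server "$BOOTSTRAP"
}

# Usage: ensure_topic sampleTopic 1
```

A replication factor of 1 is the only option on a single broker; a multi-broker cluster would normally use a higher value.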

How to Send and Receive Messages in Kafka

Start Producer

bin/kafka-console-producer.sh --topic sampleTopic --bootstrap-server localhost:9092

Type messages:

Hello Kafka
Real-time streaming test

Start Consumer

Open a new terminal, change to /usr/local/kafka, and run:

bin/kafka-console-consumer.sh --topic sampleTopic --bootstrap-server localhost:9092 --from-beginning

The consumer should print the messages you typed in the producer terminal.
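The same produce and consume test can be run without two interactive terminals. The sketch below pipes two lines into the console producer, then reads them back with a consumer that exits on its own. KAFKA_HOME, BOOTSTRAP, and the function name round_trip are assumptions chosen to match this guide's setup.

```shell
# Send two lines to a topic and read them back without an interactive session.
# KAFKA_HOME and BOOTSTRAP are assumptions that match this guide's setup.
KAFKA_HOME=${KAFKA_HOME:-/usr/local/kafka}
BOOTSTRAP=${BOOTSTRAP:-localhost:9092}

round_trip() {
  topic=$1
  # The console producer reads one message per line from stdin.
  printf '%s\n' "Hello Kafka" "Real-time streaming test" |
    "$KAFKA_HOME/bin/kafka-console-producer.sh" \
      --topic "$topic" --bootstrap-server "$BOOTSTRAP" || return 1
  # --max-messages makes the consumer exit instead of waiting forever;
  # --timeout-ms stops it if nothing arrives within 10 seconds.
  "$KAFKA_HOME/bin/kafka-console-consumer.sh" \
    --topic "$topic" --bootstrap-server "$BOOTSTRAP" \
    --from-beginning --max-messages 2 --timeout-ms 10000
}

# Usage: round_trip sampleTopic
```

Because the consumer starts from the beginning of the topic, it prints the two oldest messages, which are the ones just sent only if the topic was empty beforehand.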

Optional: Running Kafka as a Systemd Service

Create service file:

sudo nano /etc/systemd/system/kafka.service

Add:

[Unit]
Description=Apache Kafka Server
After=network.target

[Service]
Type=simple
ExecStart=/usr/local/kafka/bin/kafka-server-start.sh /usr/local/kafka/config/kraft/server.properties
ExecStop=/usr/local/kafka/bin/kafka-server-stop.sh
Restart=on-failure

[Install]
WantedBy=multi-user.target
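If you prefer not to open an editor, the same unit file can be generated from a script. This is a minimal sketch that writes the unit shown above to a path of your choosing; the function name write_kafka_unit and the scratch path in the usage comment are hypothetical.

```shell
# Write the unit file shown above to a path without opening an editor.
# $1 is the destination; $2 is the Kafka install directory.
write_kafka_unit() {
  dest=$1
  kafka_home=${2:-/usr/local/kafka}
  cat > "$dest" <<EOF
[Unit]
Description=Apache Kafka Server
After=network.target

[Service]
Type=simple
ExecStart=$kafka_home/bin/kafka-server-start.sh $kafka_home/config/kraft/server.properties
ExecStop=$kafka_home/bin/kafka-server-stop.sh
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
}

# Usage (hypothetical): write to a scratch file, review it, then install it.
#   write_kafka_unit /tmp/kafka.service
#   sudo install -m 644 /tmp/kafka.service /etc/systemd/system/kafka.service
```

The unit has no User= line, so the broker runs as root; a dedicated service account is a common hardening step for production.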

Reload systemd:

sudo systemctl daemon-reload

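Enable and start the Kafka service, then check its status. If a broker started with nohup is still running, stop it first with bin/kafka-server-stop.sh so the service can bind to port 9092.

sudo systemctl enable kafka

sudo systemctl start kafka

sudo systemctl status kafka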

How to Secure Apache Kafka

Kafka security options include:

  • SSL encryption
  • SASL authentication
  • ACL authorization
  • Network-level firewall rules

For production deployments, configure:

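  • listeners with a secured protocol (for example SSL://host:port or SASL_SSL://host:port instead of PLAINTEXT)
  • sasl.enabled.mechanisms
  • authorizer.class.name

Refer to the official Kafka security documentation for details.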
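As a starting point for a security review, the sketch below reports which of the three settings named above are present in a broker properties file. It only checks that a key is set, not that its value is secure, and the function name check_security_settings is hypothetical.

```shell
# Report which of the security-related keys are set in a broker properties file.
# Commented-out lines (starting with #) are ignored.
check_security_settings() {
  props=$1
  for key in listeners sasl.enabled.mechanisms authorizer.class.name; do
    # Match "key=value" at the start of a line; dots in the key are escaped for grep.
    pattern="^$(printf '%s' "$key" | sed 's/\./\\./g')[[:space:]]*="
    if value=$(grep -E "$pattern" "$props" | tail -n 1) && [ -n "$value" ]; then
      echo "set:     $value"
    else
      echo "missing: $key"
    fi
  done
}

# Usage: check_security_settings /usr/local/kafka/config/kraft/server.properties
```

A listeners line that still names PLAINTEXT will show as set, so read the reported values rather than relying on the set/missing labels alone.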

What Is Apache Kafka Used For?

Kafka is used for:

  • Event streaming
  • Real-time analytics
  • Log aggregation
  • Microservices communication
  • Fraud detection systems
  • IoT data ingestion

Companies like LinkedIn, Netflix, Uber, and Airbnb use Kafka to handle massive data pipelines.

Top 5 Apache Kafka Use Cases

  1. Website activity tracking
  2. Distributed log collection
  3. Real-time financial transactions
  4. IoT sensor data streaming
  5. Event-driven architecture backbone

Industries That Benefit From Kafka

  • Financial Services
  • Retail & E-commerce
  • Telecom
  • Gaming
  • Healthcare
  • Logistics
  • Automotive

Any industry handling high-volume event data can benefit from Kafka.

People Also Ask (And You Should Too!)

Q) What is Apache Kafka in simple terms?

A) Apache Kafka is a distributed messaging system that enables applications to send and receive real-time streams of data reliably and at scale.

Q) Does Kafka require ZooKeeper?

A) Older versions required ZooKeeper. Modern Kafka (3.x+) supports KRaft mode, which removes the ZooKeeper dependency.

Q) Is Kafka a message broker?

A) Yes. Kafka acts as a distributed message broker and event streaming platform.

Q) What port does Kafka use?

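A) Kafka brokers listen for client connections on port 9092 by default. In the KRaft configuration used in this guide, the controller listener additionally uses port 9093.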

Q) Is Kafka suitable for beginners?

A) Kafka is powerful but can be complex. Beginners should start with single-node local installations before scaling to clusters.

Final Summary

Apache Kafka is a distributed streaming platform used for building real-time data pipelines and event-driven systems. Installing Kafka on Linux involves installing Java, downloading Kafka binaries, configuring storage in KRaft mode, and starting the broker.

Once installed, you can create topics, produce messages, and consume real-time streams efficiently.
