How to install and Configure Apache Kafka on Debian

Captain Salem 10 min read

Apache Kafka, commonly known as Kafka, is a free and open-source distributed event streaming system. It works as a message broker capable of handling large volumes of real-time data.

If you are looking to implement a message broker or a pub/sub system, Apache Kafka is a strong choice with many features out of the box.

In this tutorial, we will show you how to get started with Kafka by installing and configuring it on Debian or any Debian-based distribution.

NOTE: Although Kafka itself is built for demanding workloads, the setup in this tutorial is not production ready. Add your own security measures before adopting Kafka in production.

Requirements

Before installing Kafka, you will need the following:

  1. A Debian system or any Debian-based distribution.
  2. A root user or sudo permissions.
  3. A valid JDK installation.

You can learn how to install the Amazon Corretto JDK in the resource below:

https://www.geekbits.io/how-to-install-amazon-corretto-jdk-on-ubuntu/
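
Before continuing, it can help to confirm that the tools used in this tutorial are on your PATH. The sketch below is only an illustration: have_cmd is a hypothetical helper name, and the list of commands (java, wget, tar) is our assumption based on the steps that follow.

```shell
# Report whether a command is available on PATH.
# have_cmd is a hypothetical helper name used only in this sketch.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Commands this tutorial relies on (an assumption based on the steps below).
for cmd in java wget tar; do
    if have_cmd "$cmd"; then
        echo "$cmd: found"
    else
        echo "$cmd: missing"
    fi
done
```

If java is reported as missing, install a JDK first using the resource above.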

Once you have the above requirements met, we can proceed.

Download the Kafka Archives

Let us start by downloading and extracting the Kafka binaries. Open your terminal and navigate into the directory where you wish to store Kafka.

In our example, we will use the Downloads folder of the debian user.

cd ~/Downloads

Next, use the wget command to download the Kafka archive.

wget https://dlcdn.apache.org/kafka/3.2.0/kafka_2.12-3.2.0.tgz

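The command downloads the archive and saves it in the Downloads folder. Older releases are eventually moved to https://archive.apache.org/dist/kafka/, so use that location if the link above no longer resolves. You can check the resource below for the latest Kafka release.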

https://kafka.apache.org/downloads
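
It is a good habit to verify a downloaded archive against its published SHA-512 digest. The sketch below is a minimal example under some assumptions: the digest file has been downloaded next to the archive with a .sha512 suffix, and it uses the "name: HEX HEX ..." layout (other layouts would need a different normalization step). The function names normalize_digest and verify_sha512 are hypothetical.

```shell
# Normalize a published digest: drop an optional "name:" prefix,
# remove all whitespace, and lowercase the hex digits.
normalize_digest() {
    sed 's/^[^:]*://' | tr -d ' \t\r\n' | tr 'A-F' 'a-f'
}

# Succeed only if the archive's SHA-512 matches the digest file.
verify_sha512() {
    archive=$1
    digest_file=$2
    expected=$(normalize_digest < "$digest_file")
    actual=$(sha512sum "$archive" | cut -d ' ' -f 1)
    [ "$expected" = "$actual" ]
}

# Only runs when both files are present in the current directory.
kafka_archive="kafka_2.12-3.2.0.tgz"
if [ -f "$kafka_archive" ] && [ -f "$kafka_archive.sha512" ]; then
    if verify_sha512 "$kafka_archive" "$kafka_archive.sha512"; then
        echo "checksum OK"
    else
        echo "checksum MISMATCH"
    fi
fi
```

A mismatch usually means an incomplete download, so fetch the archive again before extracting it.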

Next, extract the Kafka archive as:

tar -zxvf kafka_2.12-3.2.0.tgz

Replace the file name in the command above with the name of the archive you downloaded.
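
If you downloaded a different version, the directory name can be derived from the archive name instead of typing it by hand. This is a small sketch; the variable names archive and extracted_dir are our own.

```shell
archive="kafka_2.12-3.2.0.tgz"      # name of the file you downloaded
extracted_dir="${archive%.tgz}"     # strip the .tgz suffix
echo "$extracted_dir"               # prints kafka_2.12-3.2.0

# List the top-level entry without extracting, if the archive is present.
if [ -f "$archive" ]; then
    tar -tzf "$archive" | head -n 1
fi
```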

Once extracted, we need to move the Kafka directory to a better location than the Downloads folder. In our example, we will use the /opt directory.

Run the command:

sudo mv kafka_2.12-3.2.0 /opt/kafka

The command moves the extracted Kafka directory into /opt and renames it to /opt/kafka.

We can verify this by running the command:

ls -la /opt/kafka

This should return the directory listing for the kafka directory as:

total 72
drwxr-xr-x 7 debian debian  4096 May  3 15:56 .
drwxr-xr-x 3 root   root    4096 Jul 15 13:59 ..
drwxr-xr-x 3 debian debian  4096 May  3 15:56 bin
drwxr-xr-x 3 debian debian  4096 May  3 15:56 config
drwxr-xr-x 2 debian debian  4096 Jul 15 13:56 libs
-rw-r--r-- 1 debian debian 14640 May  3 15:52 LICENSE
drwxr-xr-x 2 debian debian  4096 May  3 15:56 licenses
-rw-r--r-- 1 debian debian 28184 May  3 15:52 NOTICE
drwxr-xr-x 2 debian debian  4096 May  3 15:56 site-docs

Configuring the Kafka Server

Once we have the Apache Kafka directories and binaries ready, we can proceed and configure the server to run on our system.

Enable Kafka Topic Delete

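The first step is to make sure Kafka is allowed to delete topics. Topic deletion is controlled by the delete.topic.enable broker setting. Kafka 1.0 and later enable it by default, so adding the entry mainly makes the behavior explicit; on older releases it was disabled by default. To set it, edit the server configuration file as: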

sudo nano /opt/kafka/config/server.properties

Navigate to the bottom of the file and add the entry shown below:

delete.topic.enable = true

Save and close the file.
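
If you prefer to script this change instead of editing by hand, a helper along these lines can set a key in a properties file without adding duplicates. It is a sketch only: set_property is a hypothetical name, it relies on GNU sed (the default on Debian), and it does not handle values containing | or &. The demo runs against a temporary file, not your real server.properties.

```shell
# Set key=value in a properties file: replace an existing (uncommented)
# entry if there is one, otherwise append a new line.
set_property() {
    file=$1
    key=$2
    value=$3
    # Escape dots so they match literally in the regular expression.
    key_re=$(printf '%s' "$key" | sed 's/\./\\./g')
    if grep -q "^[[:space:]]*${key_re}[[:space:]]*=" "$file"; then
        sed -i "s|^[[:space:]]*${key_re}[[:space:]]*=.*|${key}=${value}|" "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Demo on a throwaway file.
demo=$(mktemp)
printf '#delete.topic.enable=false\nbroker.id=0\n' > "$demo"
set_property "$demo" delete.topic.enable true
set_property "$demo" delete.topic.enable true     # second call adds nothing
grep -c '^delete\.topic\.enable=true$' "$demo"    # prints 1
rm -f "$demo"
```

To apply it to /opt/kafka/config/server.properties you would need write access to that file, for example from a root shell.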

Create Systemd Unit Files and Start the Kafka Service

Although we can manage the Kafka server from the bin directory, it is best practice to create a service and manage it via systemd.

We will create one unit file for the Zookeeper service and one for the Kafka service. If you navigate into the /opt/kafka/bin directory, you will see files such as:

-rwxr-xr-x 1 debian debian  1376 May  3 15:52 kafka-server-start.sh
-rwxr-xr-x 1 debian debian  1361 May  3 15:52 kafka-server-stop.sh
-rwxr-xr-x 1 debian debian  1393 May  3 15:52 zookeeper-server-start.sh
-rwxr-xr-x 1 debian debian  1366 May  3 15:52 zookeeper-server-stop.sh
-rwxr-xr-x 1 debian debian  1019 May  3 15:52 zookeeper-shell.sh

Kafka uses these zookeeper and kafka-server scripts to start and stop its services. Let us create systemd unit files that use these scripts to manage the server.

Run the command:

sudo touch /etc/systemd/zookeeper.service

The command above will create a unit file for the zookeeper service. You can rename this file to any filename you wish.

Next, edit the unit file:

sudo nano /etc/systemd/zookeeper.service

In the unit file, add the following entries.

[Unit]
Description=Apache Zookeeper Service
Documentation=http://zookeeper.apache.org
Requires=network.target remote-fs.target
After=network.target remote-fs.target

[Service]
Type=simple
# replace the 'debian' user with the user you wish to run the kafka service
User=debian
# replace the value of JAVA_HOME with the location of the Java JDK
Environment=JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64
ExecStart=/opt/kafka/bin/zookeeper-server-start.sh /opt/kafka/config/zookeeper.properties
ExecStop=/opt/kafka/bin/zookeeper-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target

Let us break down what the above unit file is doing.

In the [Unit] section, we define the metadata for the unit file and the relationship with other unit files.

In this case, we specify the description for the service and the link to the documentation resource.

Next, we specify the services we need before starting the zookeeper server. In our example, we need networking and filesystem to be ready before running.

Next comes the [Service] section. This section allows us to specify the configuration required for that service.

Here, we define the path to the Java JDK which is required by Kafka. We also define the path to the scripts that systemd should use to start and stop the service.

NOTE: We also specify the configuration files we wish to use when starting the service.

If the service exits abnormally, we tell systemd to restart it, as specified by the Restart directive.

Now, save the changes and close the zookeeper.service file.
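
A common reason for a unit failing to start is a wrong path in ExecStart or ExecStop. As a quick sanity check, the sketch below pulls the script paths out of a unit file and confirms each one exists and is executable. The helper names unit_exec_paths and check_unit_paths are hypothetical, and the check assumes each Exec line starts with a plain absolute path, as in the unit file above.

```shell
# Print the program path (first word) of every ExecStart/ExecStop line.
unit_exec_paths() {
    awk -F= '/^Exec(Start|Stop)=/ { split($2, parts, " "); print parts[1] }' "$1"
}

# Report each path; return non-zero if any is not an executable file.
check_unit_paths() {
    status=0
    for path in $(unit_exec_paths "$1"); do
        if [ -x "$path" ]; then
            echo "ok:      $path"
        else
            echo "missing: $path"
            status=1
        fi
    done
    return "$status"
}

# Only runs if the unit file from this tutorial exists.
unit=/etc/systemd/zookeeper.service
if [ -f "$unit" ]; then
    check_unit_paths "$unit" || echo "fix the paths above before starting the service"
fi
```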

We are not quite done yet. The next step is to create a Kafka service file.

Run the command:

sudo touch /etc/systemd/kafka.service

This will create a unit file for the Kafka server.

Edit the file with your favorite text editor:

sudo nano /etc/systemd/kafka.service

In the above file, add the following entries:

[Unit]
Description=Apache Kafka Server
Documentation=http://kafka.apache.org/documentation.html
Requires=network.target remote-fs.target
After=network.target remote-fs.target zookeeper.service

[Service]
Type=simple
# replace the user below
User=debian
# replace with the path to the Java JDK
Environment=JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64
ExecStart=/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/server.properties
ExecStop=/opt/kafka/bin/kafka-server-stop.sh

[Install]
WantedBy=multi-user.target

You will notice that the format is similar to the zookeeper unit file. However, in this case, we tell systemd to use the kafka-server-start.sh and kafka-server-stop.sh files to start and stop the service.

Edit Socket Server Settings

The next step is to define the address on which the server will listen. This step is optional; you only need it if you want Kafka to listen on a specific address or a custom port.

Edit the server properties as:

sudo nano /opt/kafka/config/server.properties

Locate the entry below:

# listeners = PLAINTEXT://:9092

Uncomment the line above by removing the leading # sign. Next, set the address you wish Kafka to use. An example is shown below:

listeners=PLAINTEXT://localhost:9092

In this case, we tell Kafka to listen on localhost on port 9092. Feel free to change this value as you see fit.

Save and close the file.
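
A listeners value has the form PROTOCOL://host:port. To make the parts concrete, here is a small sketch that splits the value used above with shell parameter expansion; listener_host and listener_port are hypothetical helper names.

```shell
# Extract the host part of a listener such as PLAINTEXT://localhost:9092.
listener_host() {
    rest=${1#*://}                 # drop the protocol prefix
    printf '%s\n' "${rest%:*}"     # drop the trailing :port
}

# Extract the port part (everything after the last colon).
listener_port() {
    printf '%s\n' "${1##*:}"
}

listener="PLAINTEXT://localhost:9092"
echo "host: $(listener_host "$listener")"   # prints host: localhost
echo "port: $(listener_port "$listener")"   # prints port: 9092
```

An empty host, as in the default PLAINTEXT://:9092, means Kafka listens on all interfaces.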

Start the Systemd Services and Reload the Daemon

Once you have everything in place, it is time to enable and start the services.

Start by enabling the zookeeper and kafka services.

sudo systemctl enable /etc/systemd/zookeeper.service
sudo systemctl enable /etc/systemd/kafka.service

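Make sure to specify the full path to the unit files, since /etc/systemd is outside systemd's default unit search path. If you edit a unit file later, run sudo systemctl daemon-reload so that systemd picks up the changes.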

You should see an output as shown:

sudo systemctl enable /etc/systemd/zookeeper.service
Created symlink /etc/systemd/system/multi-user.target.wants/zookeeper.service → /etc/systemd/zookeeper.service.
Created symlink /etc/systemd/system/zookeeper.service → /etc/systemd/zookeeper.service.

sudo systemctl enable /etc/systemd/kafka.service
Created symlink /etc/systemd/system/multi-user.target.wants/kafka.service → /etc/systemd/kafka.service.
Created symlink /etc/systemd/system/kafka.service → /etc/systemd/kafka.service.

Once you have enabled the services, run the commands below to start the zookeeper and kafka services.

sudo systemctl start zookeeper.service
sudo systemctl start kafka.service

The commands above should start the Zookeeper and Kafka services. You can verify by running the commands:

sudo systemctl status zookeeper.service

The command above returns the status of the Zookeeper service. An example output for a running service is shown below:

● zookeeper.service - Apache Zookeeper Service
     Loaded: loaded (/etc/systemd/zookeeper.service; enabled; vendor preset: enabled)
     Active: active (running) since Fri 2022-07-15 07:12:17 CDT; 6s ago
       Docs: http://zookeeper.apache.org
   Main PID: 2741 (java)
      Tasks: 28 (limit: 2284)
     Memory: 111.3M
        CPU: 1.465s
     CGroup: /system.slice/zookeeper.service
             └─2741 /usr/lib/jvm/java-11-openjdk-amd64/bin/java -Xmx512M -Xms512M -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:In>
Jul 15 07:12:19 debian11 zookeeper-server-start.sh[2741]: [2022-07-15 07:12:19,009] INFO zookeeper.commitLogCount=500 (org.apache.zoo>
Jul 15 07:12:19 debian11 zookeeper-server-start.sh[2741]: [2022-07-15 07:12:19,015] INFO zookeeper.snapshot.compression.method = CHEC>
Jul 15 07:12:19 debian11 zookeeper-server-start.sh[2741]: [2022-07-15 07:12:19,015] INFO Snapshotting: 0x0 to /tmp/zookeeper/version->
Jul 15 07:12:19 debian11 zookeeper-server-start.sh[2741]: [2022-07-15 07:12:19,019] INFO Snapshot loaded in 10 ms, highest zxid is 0x>
Jul 15 07:12:19 debian11 zookeeper-server-start.sh[2741]: [2022-07-15 07:12:19,019] INFO Snapshotting: 0x0 to /tmp/zookeeper/version->
Jul 15 07:12:19 debian11 zookeeper-server-start.sh[2741]: [2022-07-15 07:12:19,019] INFO Snapshot taken in 0 ms (org.apache.zookeeper>
Jul 15 07:12:19 debian11 zookeeper-server-start.sh[2741]: [2022-07-15 07:12:19,067] INFO zookeeper.request_throttler.shutdownTimeout >
Jul 15 07:12:19 debian11 zookeeper-server-start.sh[2741]: [2022-07-15 07:12:19,069] INFO PrepRequestProcessor (sid:0) started, reconf>
Jul 15 07:12:19 debian11 zookeeper-server-start.sh[2741]: [2022-07-15 07:12:19,081] INFO Using checkIntervalMs=60000 maxPerMinute=100>
Jul 15 07:12:19 debian11 zookeeper-server-start.sh[2741]: [2022-07-15 07:12:19,082] INFO ZooKeeper audit is disabled. (org.apache.zoo>

From the output above, we can see that the Zookeeper service is running successfully.

To check the Kafka service, run:

sudo systemctl status kafka.service

Similarly, if the Kafka service is running, you should see an output as shown:

● kafka.service - Apache Kafka Server
     Loaded: loaded (/etc/systemd/kafka.service; enabled; vendor preset: enabled)
     Active: active (running) since Fri 2022-07-15 07:14:31 CDT; 5s ago
       Docs: http://kafka.apache.org/documentation.html
   Main PID: 3205 (java)
      Tasks: 69 (limit: 2284)
     Memory: 358.9M
        CPU: 4.879s
     CGroup: /system.slice/kafka.service
             └─3205 /usr/lib/jvm/java-11-openjdk-amd64/bin/java -Xmx1G -Xms1G -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+ExplicitGCInvokesConcurrent -XX:MaxInlineLevel=15 -Djava.awt.headless=true -Xlog:gc*:f>
Jul 15 07:14:34 debian11 kafka-server-start.sh[3205]: [2022-07-15 07:14:34,863] INFO [/config/changes-event-process-thread]: Starting (kafka.common.ZkNodeChangeNotificationListener$ChangeEventProcessThread)
Jul 15 07:14:34 debian11 kafka-server-start.sh[3205]: [2022-07-15 07:14:34,909] INFO [SocketServer listenerType=ZK_BROKER, nodeId=0] Starting socket server acceptors and processors (kafka.network.SocketServer)
Jul 15 07:14:34 debian11 kafka-server-start.sh[3205]: [2022-07-15 07:14:34,948] INFO [SocketServer listenerType=ZK_BROKER, nodeId=0] Started data-plane acceptor and processor(s) for endpoint : ListenerName(PLAINTEXT) (kafka.network.SocketServer)
Jul 15 07:14:34 debian11 kafka-server-start.sh[3205]: [2022-07-15 07:14:34,949] INFO [SocketServer listenerType=ZK_BROKER, nodeId=0] Started socket server acceptors and processors (kafka.network.SocketServer)
Jul 15 07:14:34 debian11 kafka-server-start.sh[3205]: [2022-07-15 07:14:34,959] INFO Kafka version: 3.2.0 (org.apache.kafka.common.utils.AppInfoParser)
Jul 15 07:14:34 debian11 kafka-server-start.sh[3205]: [2022-07-15 07:14:34,959] INFO Kafka commitId: 38103ffaa962ef50 (org.apache.kafka.common.utils.AppInfoParser)
Jul 15 07:14:34 debian11 kafka-server-start.sh[3205]: [2022-07-15 07:14:34,959] INFO Kafka startTimeMs: 1657887274949 (org.apache.kafka.common.utils.AppInfoParser)
Jul 15 07:14:34 debian11 kafka-server-start.sh[3205]: [2022-07-15 07:14:34,966] INFO [KafkaServer id=0] started (kafka.server.KafkaServer)
Jul 15 07:14:35 debian11 kafka-server-start.sh[3205]: [2022-07-15 07:14:35,106] INFO [BrokerToControllerChannelManager broker=0 name=forwarding]: Recorded new controller, from now on will use broker localhost:9092 (id: 0 rack: null) (kafka.server.Broke>
Jul 15 07:14:35 debian11 kafka-server-start.sh[3205]: [2022-07-15 07:14:35,121] INFO [BrokerToControllerChannelManager broker=0 name=alterPartition]: 

And that’s it. You have successfully installed and enabled the Apache Kafka Server on your Debian system.
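
If you want to check the services from a script rather than reading the status output, a small retry helper can wait until a service reports as active. The sketch below assumes nothing about Kafka itself: wait_until is a hypothetical helper that retries any command once per second until it succeeds or the timeout expires.

```shell
# Retry a command once per second until it succeeds.
# Returns 0 on success, 1 if the timeout (in seconds) is reached first.
wait_until() {
    timeout_s=$1
    shift
    elapsed=0
    until "$@"; do
        [ "$elapsed" -ge "$timeout_s" ] && return 1
        sleep 1
        elapsed=$((elapsed + 1))
    done
}

# Example usage (commented out so the sketch has no side effects):
# wait_until 30 systemctl is-active --quiet zookeeper.service && echo "zookeeper is active"
# wait_until 30 systemctl is-active --quiet kafka.service && echo "kafka is active"
```

systemctl is-active --quiet prints nothing and reports the state through its exit code, which is what makes it convenient inside a loop like this.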

Testing Kafka Producer/Consumer

Once you have Kafka installed, it is a good idea to test whether you can publish events to a topic and consume them.

Kafka Create Topic

Let’s start by creating a Kafka topic using the kafka-topics script. Run the command:

/opt/kafka/bin/kafka-topics.sh --create --topic GeekBits-Kafka --bootstrap-server localhost:9092

In the example above, we use the kafka-topics.sh script to create a topic called GeekBits-Kafka. We should see an output as shown:

Created topic GeekBits-Kafka.

We can check the detailed information about the created topic by running the command:

/opt/kafka/bin/kafka-topics.sh --describe --topic GeekBits-Kafka --bootstrap-server localhost:9092

The command above should return detailed information about the GeekBits-Kafka topic as shown:

Topic: GeekBits-Kafka   TopicId: dLq5D9brRyCebPXVWpXdKw PartitionCount: 1       ReplicationFactor: 1    Configs: segment.bytes=1073741824
        Topic: GeekBits-Kafka   Partition: 0    Leader: 0       Replicas: 0     Isr: 0
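
The first line of the describe output summarizes the topic, including its partition count and replication factor. If you need those numbers in a script, they can be pulled out with sed. This is a sketch that assumes the output layout shown above; topic_field is a hypothetical helper name, and the sample text is copied from the output above.

```shell
# Read describe output on stdin and print the numeric value of a field,
# for example PartitionCount or ReplicationFactor.
topic_field() {
    sed -n "s/.*$1:[[:space:]]*\([0-9][0-9]*\).*/\1/p"
}

# Sample taken from the describe output shown above.
sample='Topic: GeekBits-Kafka   TopicId: dLq5D9brRyCebPXVWpXdKw PartitionCount: 1       ReplicationFactor: 1    Configs: segment.bytes=1073741824
        Topic: GeekBits-Kafka   Partition: 0    Leader: 0       Replicas: 0     Isr: 0'

printf '%s\n' "$sample" | topic_field PartitionCount      # prints 1
printf '%s\n' "$sample" | topic_field ReplicationFactor   # prints 1
```

Against a live broker, you would pipe the output of the describe command above into topic_field instead of using the sample text.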

Kafka Write Events to Topic

Once we have a topic, let us see if we can communicate with the Kafka brokers by writing some events into the Kafka topic.

Run the command as shown:

$ /opt/kafka/bin/kafka-console-producer.sh --topic GeekBits-Kafka --bootstrap-server localhost:9092
>Hello Geeks from Apache Kafka
>We hope you are enjoying this tutorial
>Thanks for tuning in.
>
>
>You can end the producer client by pressing CTRL+C

The command above launches the Kafka producer client, allowing you to write events to the specified topic. Each line you type is sent as one event. Once you are done writing your events, press CTRL + C.

Kafka Read Events

Once we are done writing some events, we can read them using the consumer client. Open a new terminal session and run the command:

$ /opt/kafka/bin/kafka-console-consumer.sh --topic GeekBits-Kafka --from-beginning --bootstrap-server localhost:9092

The consumer client will start reading the events produced to the GeekBits-Kafka topic. We should start seeing the events we wrote earlier as shown:

Hello Geeks from Apache Kafka
We hope you are enjoying this tutorial
Thanks for tuning in.


You can end the producer client by pressing CTRL+C

You can test the producer further by writing more messages as shown earlier. Once you are done consuming the events, close the consumer client by pressing CTRL + C.
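
The same round trip can be done without typing into the clients by piping a message into the producer and searching the consumer output for it. The sketch below is one possible approach, not the only one: kafka_smoke_test, KAFKA_BIN and BOOTSTRAP are names we made up, and the consumer's --timeout-ms option is used so that it exits on its own once no more messages arrive.

```shell
KAFKA_BIN=${KAFKA_BIN:-/opt/kafka/bin}       # location of the Kafka scripts
BOOTSTRAP=${BOOTSTRAP:-localhost:9092}       # broker address from this tutorial

# Produce one unique message to a topic, then confirm the consumer sees it.
kafka_smoke_test() {
    topic=$1
    marker="smoke-test-$$-$(date +%s)"
    printf '%s\n' "$marker" |
        "$KAFKA_BIN/kafka-console-producer.sh" --topic "$topic" \
            --bootstrap-server "$BOOTSTRAP" >/dev/null || return 1
    "$KAFKA_BIN/kafka-console-consumer.sh" --topic "$topic" --from-beginning \
        --bootstrap-server "$BOOTSTRAP" --timeout-ms 10000 2>/dev/null |
        grep -qxF "$marker"
}

# Only runs when the Kafka scripts are installed at KAFKA_BIN.
if [ -x "$KAFKA_BIN/kafka-console-producer.sh" ]; then
    if kafka_smoke_test GeekBits-Kafka; then
        echo "round trip OK"
    else
        echo "round trip FAILED"
    fi
fi
```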

Conclusion

Congratulations! You have finished this Apache Kafka beginner's guide. In this article, you learned how to download and set up a Kafka server, create and manage systemd services, create Kafka topics, write events to a topic, and consume them.

We hope you enjoyed this tutorial. If you did, leave us a comment below and share the tutorial.

If you run into any errors with the Kafka installation or configuration, feel free to contact us and we will help you out.
