10/14/2024

Flink + Docker + Kafka

Apache Flink is a powerful stream processing framework that enables real-time data processing. Docker provides an easy way to set up and experiment with Apache Flink locally. In this article, we'll walk through running Apache Flink with Docker, show how to integrate Apache Kafka with Flink using a Dockerfile, and provide an example PyFlink script for stream processing.

Setting Up Apache Flink with Docker

Step 1: Install Docker
If Docker is not installed on your system, you can follow the instructions in the [official documentation](https://docs.docker.com/get-docker/) to install it.

Step 2: Run Apache Flink Container
Run the following command in your terminal to start an Apache Flink container:

docker run -d -p 8081:8081 apache/flink:1.20.0 jobmanager

This pulls the Apache Flink image and starts a JobManager container, with the Flink web dashboard accessible at `http://localhost:8081`. The `jobmanager` argument is required: the image's default command only prints usage information and exits.
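
The dashboard alone only proves the JobManager is up; to actually execute jobs you also need at least one TaskManager that can reach it. A minimal sketch using a user-defined Docker network (the `flink-net` and container names here are illustrative):

docker network create flink-net
docker run -d --name jobmanager --network flink-net -p 8081:8081 -e FLINK_PROPERTIES="jobmanager.rpc.address: jobmanager" apache/flink:1.20.0 jobmanager
docker run -d --name taskmanager --network flink-net -e FLINK_PROPERTIES="jobmanager.rpc.address: jobmanager" apache/flink:1.20.0 taskmanager

`FLINK_PROPERTIES` is supported by the official Flink images and is appended to the container's Flink configuration.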

Dockerfile for Apache Kafka and Flink Integration

Step 1: Create Dockerfile
Create a `Dockerfile` in a directory of your choice with the following content:


FROM flink:1.20.0

# Add the Kafka connector. The connector version must match the Flink release;
# 3.3.0-1.20 is assumed here as the connector build for Flink 1.20 (check Maven
# Central for the exact latest version). The "sql" fat jar is used because it
# bundles the kafka-clients dependency, which the plain connector jar does not.
RUN mkdir -p /opt/flink/usrlib
RUN wget -P /opt/flink/usrlib https://repo.maven.apache.org/maven2/org/apache/flink/flink-sql-connector-kafka/3.3.0-1.20/flink-sql-connector-kafka-3.3.0-1.20.jar

Step 2: Build and Run Docker Image
Navigate to the directory containing the `Dockerfile` and run the following commands:

docker build -t flink-kafka-integration .
docker run -d -p 8081:8081 flink-kafka-integration jobmanager

This builds the Docker image and starts a JobManager container from it, with the Kafka connector jar available under `/opt/flink/usrlib`.
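
With the connector in place, here is the example Flink script in Python: a minimal PyFlink sketch that reads strings from a Kafka topic, upper-cases them, and prints the result. The topic name, group id, and jar path are assumptions matching the setup in this article (the topic is created in the Kafka walkthrough below); adjust them to your environment.

# kafka_uppercase.py - minimal PyFlink sketch (topic and jar path assumed from this article)
from pyflink.common import WatermarkStrategy
from pyflink.common.serialization import SimpleStringSchema
from pyflink.datastream import StreamExecutionEnvironment
from pyflink.datastream.connectors.kafka import KafkaOffsetsInitializer, KafkaSource

env = StreamExecutionEnvironment.get_execution_environment()
# Register the connector jar explicitly (path assumes the Dockerfile above)
env.add_jars("file:///opt/flink/usrlib/flink-sql-connector-kafka-3.3.0-1.20.jar")

source = (
    KafkaSource.builder()
    .set_bootstrap_servers("localhost:9092")
    .set_topics("telegram-in-topic")
    .set_group_id("flink-demo")
    .set_starting_offsets(KafkaOffsetsInitializer.earliest())
    .set_value_only_deserializer(SimpleStringSchema())
    .build()
)

# Read from Kafka, transform each record, and print to the TaskManager log
stream = env.from_source(source, WatermarkStrategy.no_watermarks(), "kafka-source")
stream.map(lambda s: s.upper()).print()

env.execute("kafka_uppercase_job")

Note that the official Flink images do not ship with Python, so to submit this inside the container you would first need to install `python3` and the `apache-flink` package in the Dockerfile, then run the script with `./bin/flink run -py kafka_uppercase.py`.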


Getting Started with Apache Kafka on Docker

Apache Kafka, a distributed streaming platform, is a powerful tool for building real-time data pipelines and streaming applications. Docker simplifies the process of setting up Kafka locally, allowing developers to experiment and develop with ease. In this article, we'll walk through the steps to download, install, and run Apache Kafka using Docker, providing a hands-on implementation for beginners.


Prerequisites:

Before we begin, ensure that you have Docker installed on your local machine. You can download and install Docker from the official Docker website.


Step 1: Create a Docker Compose File

Create a docker-compose.yml file in a new directory to define the Kafka and Zookeeper services. Copy and paste the following content:

version: '2'

services:
  zookeeper:
    image: wurstmeister/zookeeper:latest
    ports:
      - "2181:2181"

  kafka:
    image: wurstmeister/kafka:latest
    ports:
      - "9092:9092"
    expose:
      - "9093"
    environment:
      KAFKA_ADVERTISED_LISTENERS: INSIDE://kafka:9093,OUTSIDE://localhost:9092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INSIDE:PLAINTEXT,OUTSIDE:PLAINTEXT
      KAFKA_LISTENERS: INSIDE://0.0.0.0:9093,OUTSIDE://0.0.0.0:9092
      KAFKA_INTER_BROKER_LISTENER_NAME: INSIDE
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_CREATE_TOPICS: "telegram-topic:1:1"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock

This docker-compose.yml file defines two services: zookeeper and kafka. The OUTSIDE listener on port 9092 is published to the host for local clients, while the INSIDE listener on port 9093 is only exposed on the Docker network and is used for inter-broker traffic and for other containers that address the broker as `kafka:9093`.

Step 2: Run Docker Compose

Open a terminal in the directory where the docker-compose.yml file is located and run the following command to start the Kafka and Zookeeper containers:

docker-compose up -d

This command will download the required Docker images and start the Kafka and Zookeeper services in detached mode (-d).

Step 3: Verify Kafka Container is Running

Check if the Kafka container is running by executing the following command:

docker ps

You should see containers for both Kafka and Zookeeper in the list.

Step 4: Create Kafka Topics

Create the Kafka topics used in the rest of this walkthrough with the following commands:

docker exec -it <kafka-container-id> /opt/kafka/bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic telegram-in-topic

and

docker exec -it <kafka-container-id> /opt/kafka/bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic telegram-out-topic



Note: Replace <kafka-container-id> with the actual container ID of the Kafka container (you can find it using docker ps).
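
You can verify that the topics exist by listing them:

docker exec -it <kafka-container-id> /opt/kafka/bin/kafka-topics.sh --list --bootstrap-server localhost:9092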

Step 5: Produce and Consume Messages

Use the Kafka console producer and consumer to test your Kafka setup:

Produce messages (type a few lines, then press Ctrl+C to exit):

docker exec -it <kafka-container-id> /opt/kafka/bin/kafka-console-producer.sh --bootstrap-server localhost:9092 --topic telegram-in-topic

Consume messages (in a second terminal, on the same topic, so that the lines you just produced appear):

docker exec -it <kafka-container-id> /opt/kafka/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic telegram-in-topic --from-beginning

Replace <kafka-container-id> with the actual container ID of the Kafka container.
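
Beyond the console tools, you can exercise the broker from Python as well. A minimal sketch using the kafka-python client (an assumed third-party library, installed with pip install kafka-python; confluent-kafka would work just as well):

# pip install kafka-python
from kafka import KafkaConsumer, KafkaProducer

# Produce a test message to the topic created in Step 4
producer = KafkaProducer(bootstrap_servers="localhost:9092")
producer.send("telegram-in-topic", b"hello from python")
producer.flush()

# Read everything on the topic, giving up after 5 seconds of silence
consumer = KafkaConsumer(
    "telegram-in-topic",
    bootstrap_servers="localhost:9092",
    auto_offset_reset="earliest",
    consumer_timeout_ms=5000,
)
for record in consumer:
    print(record.value.decode("utf-8"))

This connects through the OUTSIDE listener advertised as localhost:9092, which is exactly why the compose file publishes that port to the host.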

Step 6: Stop and Remove Containers

To stop and remove the Kafka and Zookeeper containers, run:

docker-compose down

This will stop and remove the containers created by the docker-compose up command.

Conclusion:

Congratulations! You’ve successfully set up and run Apache Kafka on Docker locally. This hands-on guide provides a simple yet powerful environment for experimenting with Kafka. As you continue to explore Kafka, consider integrating it into your applications to harness its capabilities for building real-time data pipelines and streaming applications.









