As stream processing is on the rise, it is not uncommon for organizations to write hundreds of Flink jobs to cater to their business needs. Since Flink jobs vary widely in configuration, data processing scale, and resourcing needs, a reliable and consistent deployment strategy across all the jobs is essential. This post is the first of two articles and covers deploying the Flink Kubernetes Operator itself. The follow-up post will build on this one and discuss deploying Flink jobs using the Flink Operator.
Diogo Santos has an excellent article introducing the Flink Kubernetes Operator, complete with a hands-on guide. It is a great first read if you are new to Flink or the Flink Kubernetes Operator.
To Helm or not to Helm?
The official recommendation is to use Helm to deploy the Flink Operator on your Kubernetes cluster. In most cases this works fine, but there are situations where you might need to take the manual route:
- You are integrating the deployment with your organization's CI/CD framework, and it does not support Helm.
- You need to customize your operator deployment, for example, to allow for a Prometheus sidecar.
Since this article targets a production-level deployment of the Flink Operator, it makes sense to add support for a Prometheus sidecar, both to capture Flink Operator metrics and to serve as a reference for adding any sidecar your setup needs.
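As a sketch of the kind of customization involved, a Prometheus sidecar can be added to the operator's pod spec. The image, port, and volume names below are illustrative assumptions, not part of the official chart:

```yaml
# Hypothetical fragment of the operator Deployment spec: a Prometheus
# sidecar running next to the operator container. The container name,
# image tag, port, and ConfigMap name are assumptions for illustration.
spec:
  template:
    spec:
      containers:
        - name: flink-kubernetes-operator
          # ... operator container as shipped in flink-operator.yaml ...
        - name: prometheus
          image: prom/prometheus:latest
          args:
            - --config.file=/etc/prometheus/prometheus.yml
          ports:
            - containerPort: 9090
          volumeMounts:
            - name: prometheus-config
              mountPath: /etc/prometheus
      volumes:
        - name: prometheus-config
          configMap:
            name: prometheus-config
```

The same pattern applies to any sidecar: add a container entry, expose its ports, and mount whatever configuration it needs.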
Gathering pieces
We need to collect all the artifacts that will let us bypass Helm and deploy them manually after customizing them to our needs. A good starting point is downloading the Flink Operator artifacts from its official release page, or simply cloning the repository:
C:\projects> git clone https://github.com/apache/flink-kubernetes-operator.git
You might be delighted after taking a look at helm/flink-kubernetes-operator and identifying the pieces that make up the entire Flink Operator deployment. If it still feels overwhelming, do not worry; we will walk through how to use each of the artifacts:
- service accounts and roles.
- flinkdeployments and flinksessionjob CRDs.
- flink-conf.yaml, which contains the configuration for the Flink Operator.
- flink-operator.yaml, which is the deployment spec for the Flink operator itself.
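For orientation, a minimal flink-conf.yaml for the operator would enable a metrics reporter so that the Prometheus sidecar has something to scrape. The keys below follow Flink's standard Prometheus reporter configuration, but treat this as a sketch rather than the chart's shipped defaults:

```yaml
# Sketch of a flink-conf.yaml fragment enabling Flink's Prometheus
# metrics reporter. The port value is an assumption; adjust it to
# match whatever your scraper is configured to poll.
metrics.reporter.prom.factory.class: org.apache.flink.metrics.prometheus.PrometheusReporterFactory
metrics.reporter.prom.port: "9249"
```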
SAs and Roles for the Flink Operator
apiVersion: v1
kind: ServiceAccount
metadata:
  name: flinkoperator
  namespace: flinkoperator
  labels:
    deploy.artifact.io/name: 'flinkoperator'
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: flinkoperator
  labels:
    deploy.artifact.io/name: 'flinkoperator'
rules:
  - apiGroups:
      - ""
    resources:
      - pods
      - services
      - events
      - configmaps
      - secrets
    verbs:
      - "*"
  - apiGroups:
      - apps
    resources:
      - deployments
      - deployments/finalizers
      - replicasets
    verbs:
      - "*"
  - apiGroups:
      - extensions
    resources:
      - deployments
      - ingresses
    verbs:
      - "*"
  - apiGroups:
      - flink.apache.org
    resources:
      - flinkdeployments
      - flinkdeployments/status
      - flinksessionjobs
      - flinksessionjobs/status
    verbs:
      - "*"
  - apiGroups:
      - networking.k8s.io
    resources:
      - ingresses
    verbs:
      - "*"
  - apiGroups:
      - coordination.k8s.io
    resources:
      - leases
    verbs:
      - "*"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: flinkoperator
  labels:
    deploy.artifact.io/name: 'flinkoperator'
roleRef:
  kind: ClusterRole
  name: flinkoperator
  apiGroup: rbac.authorization.k8s.io
subjects:
  - kind: ServiceAccount
    name: flinkoperator
    namespace: flinkoperator
Save the manifests above as role.yaml and apply them:
kubectl apply --filename role.yaml -n flinkoperator
Next, install the CRDs. Note that kubectl replace expects the resource to already exist; on a first install, use kubectl create instead. CRDs are cluster-scoped, so no namespace flag is needed:
kubectl replace --filename helm/flink-kubernetes-operator/crds/flinksessionjobs.flink.apache.org-v1.yml
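The remaining pieces follow the same pattern: the flinkdeployments CRD sits next to the flinksessionjobs one in the chart's crds/ directory, and flink-operator.yaml is the deployment spec listed earlier. Treat this as a sketch of the final steps rather than a verified script for your cluster:

```shell
# Install the second CRD (use `kubectl create` on a first install).
kubectl replace --filename helm/flink-kubernetes-operator/crds/flinkdeployments.flink.apache.org-v1.yml

# Deploy the operator itself into the flinkoperator namespace.
kubectl apply --filename flink-operator.yaml -n flinkoperator

# Verify that the operator pod comes up.
kubectl get pods -n flinkoperator
```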