Deploy Apache Flink cluster on Kubernetes

When it comes to deploying Apache Flink on Kubernetes, you can do it in two modes: as a session cluster or as a job cluster. A session cluster is a long-running standalone cluster that can run multiple jobs, while a job cluster deploys a dedicated cluster for each job. By Elvis David.

Apache Flink is a framework and distributed processing engine for stateful computations over unbounded and bounded data streams. Flink has been designed to run in all common cluster environments, performing computations at in-memory speed and at any scale.

In the article you will find clear advice on setting up a session cluster, covering:

  • Deployment object which specifies the JobManager
  • Deployment object which specifies the TaskManagers
  • A service object exposing the JobManager’s REST APIs
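As an illustrative sketch of the third item, a minimal Service exposing the JobManager's ports could look like the following (the ports are Flink's defaults; the resource name and labels are assumptions and should match your Deployments):

```yaml
# jobmanager-service.yaml (sketch; name and selector labels are assumptions)
apiVersion: v1
kind: Service
metadata:
  name: flink-jobmanager
spec:
  type: ClusterIP
  ports:
  - name: rpc
    port: 6123
  - name: blob
    port: 6124
  - name: query
    port: 6125
  - name: ui        # REST API / web UI
    port: 8081
  selector:
    app: flink
    component: jobmanager
```

TaskManagers can then reach the JobManager through the stable DNS name `flink-jobmanager` instead of a pod IP.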

A TaskManager can be configured with a certain number of task slots, which gives it the ability to execute several tasks at the same time; this is what we call parallelism.
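For example, the number of slots is controlled by the `taskmanager.numberOfTaskSlots` key in `flink-conf.yaml`. A small fragment (the value 4 is just an illustration):

```yaml
# flink-conf.yaml fragment: each TaskManager offers 4 slots,
# so up to 4 parallel subtasks can run per TaskManager pod
taskmanager.numberOfTaskSlots: 4
# default parallelism used when a job does not set its own
parallelism.default: 4
```

Total available parallelism in a session cluster is then the number of TaskManager replicas multiplied by the slots per TaskManager.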

Following this guide, you will create a Deployment object that instantiates the JobManager. This Deployment creates a single JobManager from the Flink 1.10.0 container image (Scala build) and exposes the container ports for RPC communication, the blob server, the queryable state server, and the web UI. The files needed for deployment are included. Nice one!
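A sketch of such a JobManager Deployment is shown below. The exact image tag, labels, and environment variable come from common Flink-on-Kubernetes examples and are assumptions; the container ports are Flink's defaults for the four endpoints mentioned above:

```yaml
# jobmanager-deployment.yaml (sketch; image tag and labels are assumptions)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: flink-jobmanager
spec:
  replicas: 1                      # a single JobManager
  selector:
    matchLabels:
      app: flink
      component: jobmanager
  template:
    metadata:
      labels:
        app: flink
        component: jobmanager
    spec:
      containers:
      - name: jobmanager
        image: flink:1.10.0-scala_2.11   # assumed tag for "Flink 1.10.0 for Scala"
        args: ["jobmanager"]             # start the container in JobManager mode
        ports:
        - containerPort: 6123            # RPC
        - containerPort: 6124            # blob server
        - containerPort: 6125            # queryable state
        - containerPort: 8081            # web UI
        env:
        - name: JOB_MANAGER_RPC_ADDRESS
          value: flink-jobmanager        # must match the Service name
```

Applying it with `kubectl apply -f jobmanager-deployment.yaml` creates the pod; the TaskManager Deployment follows the same pattern with `args: ["taskmanager"]`.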

[Read More]

Tags apache devops cloud data-science big-data