Apache Flink is one of the most powerful open source distributed processing engines. Its easy integration with popular data platforms and applications such as Kafka, Elasticsearch and Cassandra has given Flink a unique place in the current data engineering and data streaming space.
One of the most common challenges Flink users face concerns its performance metrics. Although Flink provides an Admin UI, most organisations tend not to use it because of its limitations in key areas like user authentication, dashboards and alerts.
An alternative that gives organisations more advanced security, more detailed dashboards and more features is Graphite. Graphite is an enterprise-ready monitoring tool that makes time-series metrics easier to store, retrieve, share, and visualise.
In this blog we show you the steps required to integrate Flink with Graphite.
1. Start Graphite
Install and start Graphite. We used a Docker image to run Graphite in a container.
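Before wiring up Flink, it can help to confirm that Graphite is accepting metrics. The sketch below is one possible smoke test, not part of the original setup: it sends a single datapoint using Carbon's plaintext protocol (one "path value timestamp" line per datapoint). The host localhost, the port 2003 (Carbon's default plaintext port, which the container must expose) and the metric name test.flink.smoke are all assumptions.

```python
import socket
import time


def format_datapoint(path: str, value: float, timestamp: int) -> str:
    """Build one newline-terminated line of Carbon's plaintext protocol."""
    return f"{path} {value} {timestamp}\n"


def send_datapoint(path: str, value: float,
                   host: str = "localhost", port: int = 2003) -> None:
    """Send a single datapoint to Carbon over TCP.

    host and port are assumptions; adjust them to match how the
    Graphite container's ports are published.
    """
    line = format_datapoint(path, value, int(time.time()))
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))
```

Calling send_datapoint("test.flink.smoke", 1) and then looking for test.flink.smoke in the Graphite metric tree should show whether the container is reachable. A connection error here usually points to a port mapping problem rather than anything related to Flink.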
2. Start Flink 1.11
Install and start Apache Flink 1.11. For the sake of simplicity, the Flink cluster is started in standalone mode with a single JobManager and one TaskManager.
3. Configure the Graphite metrics reporter
Go to the conf folder of the Flink installation directory and edit flink-conf.yaml. Add the lines below to the metrics section:
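The exact lines depend on the Flink version and on where Graphite is running. As a rough sketch, the reporter settings could be generated with the small Python helper below, which prints entries ready to paste into flink-conf.yaml. The key names follow the Graphite reporter section of the Flink metrics documentation as we understand it for 1.11; the reporter name grph, the host localhost, the port 2003 and the 60 second interval are assumptions, so check them against the documentation for your version. Older releases configure the reporter with a class entry rather than a factory class, and depending on the distribution the Graphite reporter jar may need to be placed under plugins or lib.

```python
# Assumed reporter settings; "grph" is an arbitrary reporter name.
GRAPHITE_REPORTER_SETTINGS = {
    "metrics.reporter.grph.factory.class":
        "org.apache.flink.metrics.graphite.GraphiteReporterFactory",
    "metrics.reporter.grph.host": "localhost",   # assumption: Graphite on the same machine
    "metrics.reporter.grph.port": "2003",        # assumption: Carbon plaintext port
    "metrics.reporter.grph.protocol": "TCP",
    "metrics.reporter.grph.interval": "60 SECONDS",
}


def to_flink_conf_lines(settings: dict) -> str:
    """Render settings as 'key: value' lines, the format flink-conf.yaml uses."""
    return "\n".join(f"{key}: {value}" for key, value in settings.items())


# Print the lines so they can be copied into conf/flink-conf.yaml.
print(to_flink_conf_lines(GRAPHITE_REPORTER_SETTINGS))
```

Flink reads flink-conf.yaml at startup, so the cluster needs a restart after the file is changed.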
4. Deploy the Flink streaming app from the command line
The Flink streaming app should be up and running. The app is now streaming data from a Kafka source topic and sending the transformed data to an output Kafka topic.
5. Refresh the Graphite dashboard to check the Flink job metrics
Refresh the Graphite dashboard. The Flink job (FlinkDemoJob in our case) now appears under the task manager, and metrics are available for each operator of the job. Visualise one of the metrics in the Graphite composer.
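The same metrics can also be read outside the composer through Graphite's render API, which returns JSON when format=json is requested. The sketch below is a minimal illustration under several assumptions: Graphite is served at http://localhost, the job is named FlinkDemoJob, the default operator scope is in use, and numRecordsIn is the metric of interest. The wildcard target shown in the usage note is a guess at the resulting path and will differ if the scope format or host naming has been customised.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def render_url(base_url: str, target: str, window: str = "-10min") -> str:
    """Build a render API URL that asks for JSON output."""
    query = urlencode({"target": target, "from": window, "format": "json"})
    return f"{base_url.rstrip('/')}/render?{query}"


def fetch_series(url: str) -> list:
    """Fetch the series list; each item has 'target' and 'datapoints' keys."""
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)


def latest_values(series: list) -> dict:
    """Map each series name to its most recent non-null value."""
    result = {}
    for item in series:
        # Datapoints arrive as [value, timestamp] pairs; value may be null.
        values = [value for value, _ts in item["datapoints"] if value is not None]
        if values:
            result[item["target"]] = values[-1]
    return result
```

For example, latest_values(fetch_series(render_url("http://localhost", "*.taskmanager.*.FlinkDemoJob.*.*.numRecordsIn"))) would return the latest record count per operator subtask, provided the path pattern matches what appears in the Graphite tree.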
Below is one of the examples provided:
Other useful links: