DevConf.cz 2021 has ended
Saturday, February 20 • 2:00pm - 2:25pm
Beyond k8s Monitoring: Data to Knowledge to Action


A typical Kubernetes cluster has many moving parts, so there are many ways in which it can break. Monitoring tools such as Prometheus are therefore often used to collect usage and health metrics from deployments. However, when your fleet contains thousands of deployments, inspecting each deployment's health metrics individually to diagnose issues becomes tedious and inefficient.

In this talk, we describe how we applied data science to the health metrics collected from OpenShift clusters to help us proactively identify issues. Specifically, we used clustering to group deployments that behave similarly. Then, we applied frequent pattern mining to determine the prominent, 'defining' patterns in each group. These patterns can help us precisely identify and codify the problem affecting the deployments, so we can diagnose issues proactively and at scale. We found that in many cases, the patterns determined by these methods coincide very well with the rules developed by subject matter experts (SMEs). We therefore believe these techniques can generate actionable insights when added as an extension to your existing monitoring system.
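
The abstract does not give implementation details, so the following is only a minimal sketch of the cluster-then-mine idea, not the speaker's actual pipeline. It assumes each deployment is summarized as a set of firing symptoms. The deployment names, symptom names, the plain k-means routine, and the brute-force itemset counter (standing in for an optimized algorithm such as Apriori or FP-growth) are all illustrative assumptions.

```python
from collections import Counter
from itertools import combinations

# Hypothetical data: deployment name -> set of symptoms observed on it.
DEPLOYMENTS = {
    "cluster-a": {"etcd_slow", "api_latency_high"},
    "cluster-b": {"etcd_slow", "api_latency_high", "disk_pressure"},
    "cluster-c": {"etcd_slow", "api_latency_high"},
    "cluster-d": {"node_not_ready", "pod_crashloop"},
    "cluster-e": {"node_not_ready", "pod_crashloop", "disk_pressure"},
    "cluster-f": {"node_not_ready", "pod_crashloop"},
}

def to_vector(symptoms, vocabulary):
    """Encode a symptom set as a 0/1 vector over the sorted vocabulary."""
    return [1.0 if s in symptoms else 0.0 for s in vocabulary]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, iterations=20):
    """Plain k-means; centroids are seeded deterministically from evenly spaced points."""
    step = len(vectors) // k
    centroids = [list(vectors[i * step]) for i in range(k)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assign each vector to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(v, centroids[c])) for v in vectors]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, l in zip(vectors, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def frequent_patterns(transactions, min_support, max_size=2):
    """Count every itemset up to max_size and keep those meeting min_support."""
    counts = Counter()
    for items in transactions:
        for size in range(1, max_size + 1):
            counts.update(combinations(sorted(items), size))
    n = len(transactions)
    return {p: c / n for p, c in counts.items() if c / n >= min_support}

def defining_patterns(deployments, k=2, min_support=0.9):
    """Group similar deployments, then mine the patterns common within each group."""
    names = sorted(deployments)
    vocabulary = sorted(set().union(*deployments.values()))
    labels = kmeans([to_vector(deployments[n], vocabulary) for n in names], k)
    result = {}
    for c in range(k):
        members = [n for n, l in zip(names, labels) if l == c]
        patterns = frequent_patterns([deployments[n] for n in members], min_support)
        result[c] = (members, patterns)
    return result

if __name__ == "__main__":
    for group, (members, patterns) in defining_patterns(DEPLOYMENTS).items():
        print(group, members, sorted(patterns))
```

In this toy data, `disk_pressure` appears in both groups but in only one of three members each, so it falls below the support threshold and is not reported as a defining pattern for either group. A real fleet would need a more careful choice of features, number of groups, and support threshold.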

Slides: https://slides.com/chauhankaranraj/beyond-k8s-monitoring


Karanraj Chauhan

Software Engineer, Red Hat
I like math, machine learning, and deep learning. Big fan of CPUs, GPUs, FPGAs, and other such lightning powered stones.

Saturday February 20, 2021 2:00pm - 2:25pm CET
Session Room 5