Red Hat's CKI ("cookie") project started out as a Jenkins proof-of-concept to show that kernel testing can be integrated into the kernel workflow.
Today, CKI provides all the infrastructure that stands between a merge request for a Fedora/CentOS/RHEL kernel repository on gitlab.com and the green check mark that signals that it can be merged.
Join us in our journey of how a bunch of engineers (that couldn't spell DevOps correctly) learned to operate a fast-moving resilient service used by kernel developers in their daily work.
We will talk about:
How we keep the CKI service up and running by - using a state-of-the art logging, monitoring and alerting setup - being persistent in the case of transient failures - reducing single-point-of-failure
How we make it easy to modify the testing pipeline by - running canary pipelines for unmerged code
How we make it safe to hack on the underlying infrastructure by - dividing the backend infrastructure into microservices - automatically deploying testing versions for merge requests - automating production deployments