Don't write controllers like Charlie Don't does: Avoiding common Kubernetes controller mistakes

Slides

Common mistakes

Using a simple client that talks directly to the API server

  • Problem: A
  • Problem: Updates send the whole object -> No-op updates waste API server resources
  • Fix: Use a cached client
  • Problem: Cache invalidation

Don’t use custom caching

  • Problem: Good luck dealing with concurrency
  • Hard: Controllers must maintain a per-kind cache
  • Problem: Eventual consistency makes everything more complicated
  • Fix: Use a framework

Predicates only apply to the current watch

  • A predicate passed to For(...) only applies to that call, not to the other watches
  • Also check whether you should reconcile the low-level object itself, or whether reconciling the higher-level objects that reference it is the better fit
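
To make the scoping point concrete, here is a minimal plain-Go sketch. It does not use controller-runtime; `UpdateEvent`, `Predicate`, `Watch` and `Enqueued` are hypothetical names made up for illustration. It only shows that a filter attached to one watch leaves events from another watch untouched.

```go
package main

import "fmt"

// UpdateEvent is a hypothetical stand-in for an update event from a watch.
type UpdateEvent struct {
	Kind          string
	Name          string
	OldGeneration int64
	NewGeneration int64
}

// Predicate decides whether an event should trigger a reconcile.
type Predicate func(UpdateEvent) bool

// GenerationChanged drops updates that did not bump the generation,
// for example status-only writes.
func GenerationChanged(e UpdateEvent) bool {
	return e.OldGeneration != e.NewGeneration
}

// Watch pairs a kind with the predicates attached to that watch only.
type Watch struct {
	Kind       string
	Predicates []Predicate
}

// Enqueued returns "Kind/Name" for every event that passes the predicates
// of the watch registered for its kind.
func Enqueued(watches []Watch, events []UpdateEvent) []string {
	byKind := map[string]Watch{}
	for _, w := range watches {
		byKind[w.Kind] = w
	}
	var out []string
	for _, e := range events {
		w, ok := byKind[e.Kind]
		if !ok {
			continue // nobody watches this kind
		}
		pass := true
		for _, p := range w.Predicates {
			if !p(e) {
				pass = false
				break
			}
		}
		if pass {
			out = append(out, e.Kind+"/"+e.Name)
		}
	}
	return out
}

func main() {
	watches := []Watch{
		{Kind: "Gateway", Predicates: []Predicate{GenerationChanged}},
		{Kind: "Service"}, // no predicates: every event gets through
	}
	events := []UpdateEvent{
		{Kind: "Gateway", Name: "gw", OldGeneration: 1, NewGeneration: 1}, // filtered
		{Kind: "Gateway", Name: "gw", OldGeneration: 1, NewGeneration: 2}, // passes
		{Kind: "Service", Name: "svc", OldGeneration: 1, NewGeneration: 1}, // passes, unfiltered
	}
	fmt.Println(Enqueued(watches, events))
}
```

The no-op Service update still triggers a reconcile because the filter was only attached to the Gateway watch; each watch needs its own predicates.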

Tools

KRT

Still under development

  • Operations on collections (Kubernetes objects with state tracking)
  • Fetch function that handles transformation

StateDB

  • In-memory database for Go with watch channels
  • You can set up a table that stores all objects of a kind (provided by the client)
  • Triggers hooks when changes happen in the database that you can react to
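
The table-plus-watch-channel idea can be sketched with the standard library alone. This is not StateDB's API; `Table`, `NewTable`, `Upsert` and `Snapshot` are hypothetical names. The sketch assumes the common pattern where a watch channel is closed on the next write so that every waiting reader wakes up.

```go
package main

import (
	"fmt"
	"sync"
)

// Table is a hypothetical in-memory table for objects of one kind.
type Table[T any] struct {
	mu      sync.Mutex
	objects map[string]T
	changed chan struct{} // closed on the next write
}

func NewTable[T any]() *Table[T] {
	return &Table[T]{objects: map[string]T{}, changed: make(chan struct{})}
}

// Upsert stores obj under key and wakes everyone holding the current watch channel.
func (t *Table[T]) Upsert(key string, obj T) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.objects[key] = obj
	close(t.changed)
	t.changed = make(chan struct{})
}

// Snapshot returns a copy of the table and a channel that is closed
// once the table changes after this snapshot was taken.
func (t *Table[T]) Snapshot() (map[string]T, <-chan struct{}) {
	t.mu.Lock()
	defer t.mu.Unlock()
	out := make(map[string]T, len(t.objects))
	for k, v := range t.objects {
		out[k] = v
	}
	return out, t.changed
}

func main() {
	pods := NewTable[string]()
	_, watch := pods.Snapshot()

	// Simulates the client delivering an object from the API server.
	go pods.Upsert("default/web", "Running")

	<-watch // blocks until the table has changed
	snap, _ := pods.Snapshot()
	fmt.Println(snap["default/web"]) // Running
}
```

A reconcile loop built on this would take a snapshot, compute its output, and then wait on the returned channel before running again.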

Controller-Runtime

The kubebuilder one

  • Includes a cached client
  • Works on the reconciler pattern -> Makes triggers simple

Tips

  • Limit the number of API server updates
    • Check for a diff yourself and don’t send updates if there is nothing new
    • Use patch instead of update and send only the changed fields -> Especially for .status
  • Use a framework that handles watching, coalescing and caching (KRT, StateDB, controller-runtime)
  • Use predicates if you’re using controller-runtime; they help you filter out no-op events by checking them against the cache and filters
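
The first tip (diff yourself, then patch only what changed) can be sketched without any Kubernetes dependency. The `Status` type and `StatusPatch` helper below are hypothetical; with controller-runtime the same idea would typically go through `client.MergeFrom` and `Status().Patch` instead of hand-built JSON.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Status is a hypothetical status block of a custom resource.
type Status struct {
	Ready    bool   `json:"ready"`
	Replicas int    `json:"replicas"`
	Message  string `json:"message"`
}

// StatusPatch returns a JSON merge patch containing only the status fields
// that differ, or nil when there is nothing to send.
func StatusPatch(current, desired Status) ([]byte, error) {
	if current == desired {
		return nil, nil // no-op: skip the API server call entirely
	}
	changed := map[string]any{}
	if current.Ready != desired.Ready {
		changed["ready"] = desired.Ready
	}
	if current.Replicas != desired.Replicas {
		changed["replicas"] = desired.Replicas
	}
	if current.Message != desired.Message {
		changed["message"] = desired.Message
	}
	return json.Marshal(map[string]any{"status": changed})
}

func main() {
	current := Status{Ready: true, Replicas: 2, Message: "ok"}
	desired := current
	desired.Replicas = 3

	patch, err := StatusPatch(current, desired)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(patch)) // {"status":{"replicas":3}}

	noop, _ := StatusPatch(desired, desired)
	fmt.Println(noop == nil) // true
}
```

The early return is the important part: when the observed and desired status match, no request is sent at all.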

Q&A

  • Do you know where your reconciliations are coming from?
    • Counts: Yes, the frameworks provide metrics and you can implement your own
    • But controller-runtime abstracts away the source of the patch, so you would have to compare the before and after state yourself, which you should not do
  • What about state sharing across multiple threads?
    • controller-runtime handles each reconcile as idempotent, so you can just multithread
    • But handling consistency can still be hard because you have to design all of your operations as idempotent by rebuilding the state each time
  • What are your thoughts on controllers that do stuff in the real world (especially b/c it takes longer and there are no native observers)?
    • Do something like the KRT project and keep the state separately
  • What if someone changes things at the cloud provider?
    • A question of philosophy -> Usually just treat the operator as the source of truth
  • How do you test your operators?
    • Depends on your output (Kubernetes objects make stuff simple)
    • For Cilium: Simple b/c it’s just creating Kubernetes objects
    • With outside interaction: In-memory state representation or mocking
    • For complex controllers, split the operator into: Ingestion, data model and transformation
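
One way to read the ingestion / data model / transformation split, sketched in plain Go: ingestion fills an in-memory model, and the transformation is a pure function from that model to the desired output, so it can be tested without a cluster or a cloud provider. `Model`, `Route`, `Transform` and the `.example.internal` suffix are hypothetical and only there for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// Model is the in-memory data model that ingestion (watches, cloud API
// polling) would fill. Here: service name -> backend addresses.
type Model struct {
	Services map[string][]string
}

// Route is the output the controller would write back.
type Route struct {
	Host     string
	Backends []string
}

// Transform rebuilds the full desired output from the model on every call.
// It has no side effects and a deterministic order, so calling it twice
// with the same model yields the same result.
func Transform(m Model) []Route {
	routes := make([]Route, 0, len(m.Services))
	for name, backends := range m.Services {
		if len(backends) == 0 {
			continue // nothing to route to yet
		}
		sorted := append([]string(nil), backends...) // copy, do not mutate the model
		sort.Strings(sorted)
		routes = append(routes, Route{Host: name + ".example.internal", Backends: sorted})
	}
	sort.Slice(routes, func(i, j int) bool { return routes[i].Host < routes[j].Host })
	return routes
}

func main() {
	// In a test, the model is built by hand instead of by real ingestion.
	m := Model{Services: map[string][]string{
		"web":   {"10.0.0.2", "10.0.0.1"},
		"api":   {"10.0.0.3"},
		"empty": nil,
	}}
	for _, r := range Transform(m) {
		fmt.Println(r.Host, r.Backends)
	}
}
```

Because the output is rebuilt from scratch each time, this also matches the earlier answer about designing every operation as idempotent.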