CRD Data Architecture for Multi-Cluster Kubernetes

Watch talk on YouTube

Background

CRDs:

Platform: Apache Spark, Argo, Jupyter Notebooks, …
Tier: Part of a platform that bundles access policies, resource usage and network controls -> e.g. beta or prod
ClusterSet: Shards within a tier (aka availability zone)
Cluster: Part of a ClusterSet that can be destroyed/recreated
ComputeNamespace = Namespace + ServiceAccount + LimitRange + ResourceQuota + RBAC
Part of a ClusterSet; all clusters in the set have the same ComputeNamespace
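
The ComputeNamespace composition above can be sketched as a function that fans one logical object out into the native Kubernetes objects it bundles. This is only an illustration, not the speakers' operator: the function name, its parameters, the default requests and the choice of the built-in "edit" ClusterRole are all assumptions.

```python
from typing import Dict, List


def expand_compute_namespace(name: str, cpu_quota: str, memory_quota: str) -> List[Dict]:
    """Return the native manifests one ComputeNamespace could fan out to (hypothetical helper)."""
    meta = {"name": name, "namespace": name}
    return [
        {"apiVersion": "v1", "kind": "Namespace", "metadata": {"name": name}},
        {"apiVersion": "v1", "kind": "ServiceAccount", "metadata": meta},
        {
            "apiVersion": "v1",
            "kind": "LimitRange",
            "metadata": meta,
            # Assumed default: containers without explicit requests get a small one.
            "spec": {"limits": [{"type": "Container",
                                 "defaultRequest": {"cpu": "100m", "memory": "128Mi"}}]},
        },
        {
            "apiVersion": "v1",
            "kind": "ResourceQuota",
            "metadata": meta,
            "spec": {"hard": {"requests.cpu": cpu_quota, "requests.memory": memory_quota}},
        },
        {
            "apiVersion": "rbac.authorization.k8s.io/v1",
            "kind": "RoleBinding",
            "metadata": meta,
            # Assumption: bind the namespace's ServiceAccount to the built-in "edit" ClusterRole.
            "roleRef": {"apiGroup": "rbac.authorization.k8s.io", "kind": "ClusterRole", "name": "edit"},
            "subjects": [{"kind": "ServiceAccount", "name": name, "namespace": name}],
        },
    ]
```

Applying the same list to every cluster of a ClusterSet is what would make the ComputeNamespace identical across the set.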

Goals & Challenges

Scale: 9000 Software Engineers

Challenges:

Scaling
Decomposition

Goal: Manage software platforms on Kubernetes via Kubernetes, using operators

KEPs by SIG-Multicluster

Cluster Profiles

Name
Manager
Status
- K8S Version
- Conditions (Health)
Cluster access options:
- Work API (another proposal)
- Push via OIDC
- Push with Secret
- Certificate Auth
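
A minimal sketch of the fields listed above as a plain data structure. It mirrors these notes, not the exact schema of the ClusterProfile KEP; the class names, field names and the "Healthy" condition type are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Condition:
    type: str
    status: str  # "True", "False" or "Unknown", following the Kubernetes condition convention


@dataclass
class ClusterProfile:
    name: str
    manager: str
    kubernetes_version: str
    conditions: List[Condition] = field(default_factory=list)

    def is_healthy(self, condition_type: str = "Healthy") -> bool:
        # Healthy only if the named condition is present and reports "True".
        return any(c.type == condition_type and c.status == "True" for c in self.conditions)
```

A consumer of the inventory could then filter on `is_healthy()` before scheduling work onto a cluster.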

ClusterSet

Within a ClusterSet namespace sameness applies -> All namespaces are the same in all clusters of a set
Mutation = Delete and recreate cluster
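
Taking the sameness rule above literally, drift inside a ClusterSet can be detected by comparing each cluster's namespaces against the union over the whole set. The function below is a hypothetical helper and assumes the namespace lists have already been fetched per cluster.

```python
from typing import Dict, Set


def sameness_violations(namespaces_by_cluster: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Map each drifting cluster to the namespaces it is missing."""
    # The union of all namespaces is what every cluster of the set should have.
    expected = set().union(*namespaces_by_cluster.values())
    return {
        cluster: expected - present
        for cluster, present in namespaces_by_cluster.items()
        if expected - present
    }
```

An empty result means the set is consistent; since mutation means delete and recreate, a drifting cluster would be replaced rather than patched.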

Cluster Names

Unique Name
Valid RFC 1123 DNS label
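
Both naming rules can be checked in a few lines. The pattern is the usual RFC 1123 label rule (lowercase alphanumerics and "-", starting and ending with an alphanumeric, at most 63 characters); the function names are made up for this sketch.

```python
import re
from typing import Iterable, List

# RFC 1123 DNS label: lowercase alphanumerics or '-', alphanumeric at both ends.
_DNS_LABEL = re.compile(r"[a-z0-9]([-a-z0-9]*[a-z0-9])?")


def is_dns_label(name: str) -> bool:
    return len(name) <= 63 and _DNS_LABEL.fullmatch(name) is not None


def invalid_cluster_names(names: Iterable[str]) -> List[str]:
    """Return names that are not valid labels or that repeat an earlier name."""
    seen = set()
    rejected = []
    for name in names:
        if not is_dns_label(name) or name in seen:
            rejected.append(name)
        seen.add(name)
    return rejected
```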

Cluster Inventory

All ClusterProfiles should reside in a dedicated hub cluster

TODO: Diagram

HA

They use Kine (from the k3s project) as an etcd shim backed by PostgreSQL

Referential Integrity

The CRDs all refer to each other (e.g. Tier -> Platform)
Solution: CEL Expressions combined with webhooks and operators for business logic validation
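
CEL rules on a CRD only see the object being validated, so cross-object checks such as Tier -> Platform have to live in a webhook or operator. The logic such a component might run can be sketched as follows; the function names and the flat name-to-name mapping are assumptions, not the speakers' implementation.

```python
from typing import Dict, List, Set


def dangling_tiers(platforms: Set[str], tier_to_platform: Dict[str, str]) -> List[str]:
    """Tiers whose referenced Platform does not exist (would be rejected on create/update)."""
    return sorted(t for t, p in tier_to_platform.items() if p not in platforms)


def can_delete_platform(platform: str, tier_to_platform: Dict[str, str]) -> bool:
    """A Platform may only be deleted once no Tier refers to it."""
    return platform not in tier_to_platform.values()
```

The first check guards the referencing side, the second guards the referenced side; together they keep references from dangling in either direction.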

Resource Creation

They have a simple API that is just a thin wrapper around the Kubernetes API

TODO: Diagram

Q&A

Why does everyone build their own multi-cluster tooling instead of using open source?
- Their solution predates SIG-Multicluster
- They are using some open source solutions like Karmada
Could you explain ClusterProfile <-> ClusterInventory again? He did, see the livestream
Where does your Postgres run (does it run on the same Kubernetes it shims)?
- There are no cross-dependencies
- The management clusters are lightweight
Are you running a real Kubernetes cluster for the hub?
- Nope, we just use the apiserver