kivikakk.ee

cluster overview

I should probably add some distinction between dev and prod mode for next time, oops.

Here follows a short overview of the current state of the Kubernetes cluster that constitutes my infrastructure. I link to the Flux bits and pieces that install and configure them. It’s more personal documentation than anything useful for others.

$ k gns
NAME              STATUS   AGE
cert-manager      Active   105d
chog              Active   105d
default           Active   105d
enbi              Active   105d
external-dns      Active   105d
flux-system       Active   105d
furpoll           Active   105d
ingress-nginx     Active   105d
kube-node-lease   Active   105d
kube-public       Active   105d
kube-system       Active   105d
kv                Active   105d
linkding          Active   105d
miniflux          Active   105d
minio-kala        Active   105d
minio-operator    Active   105d
nossa             Active   105d
outline           Active   105d
postgres-cassax   Active   105d
shynet            Active   105d
static            Active   105d

enbi: fully operational!

Update on enbi (source) — it now monitors Pods with annotations describing which flake to build and what tag that build should produce. When it notices one failing to start due to a missing image matching the annotations, it creates a NixBuild matching the requirements, which in turn runs the build and loads it into the cluster! Successful builds clean up after themselves, though I’m leaving around the NixBuild objects themselves for now. Failing builds leave the Job/Pod in place for troubleshooting.

Updating the version of one of my apps that uses my standard pattern for building Docker images with Nix is now just a matter of changing the tag in one place (e.g.); the cluster handles building it and moving to the new release without downtime.
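As a sketch of the shape this takes: a Deployment's pod template might carry annotations like the ones below, and enbi reacts when the referenced image can't be pulled. The annotation keys and values here are hypothetical illustrations, not enbi's actual names; the real ones live in its source.

```yaml
# Annotation names invented for illustration only.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  template:
    metadata:
      annotations:
        enbi.hrzn.ee/flake: "github:example/myapp/v1.2.3"  # which flake to build
        enbi.hrzn.ee/tag: "myapp:v1.2.3"                   # tag the build should produce
    spec:
      containers:
        - name: myapp
          image: myapp:v1.2.3  # the missing image that triggers a NixBuild
```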

This has been a fun one-week sojourn into writing Kubernetes operators :) The API is pretty neat, controller-runtime feels clean, and it was enjoyable discovering how many assumptions I had to unlearn while negotiating where the controller runs, where its jobs get scheduled, how to move data around, and the like.

nix build → nb → enbi

Still pre-alpha, but tonight I got the first complete run of a little Kubernetes controller I’ve been wanting!

Screenshot of a terminal, showing a Nix build in progress. At the top of the screen a Kubernetes CRD called “nixbuild.enbi.hrzn.ee” is visible, and at the bottom, the Nix build process can be seen producing a layered Docker image, which is then imported.

Wahoo, yippee, etc.! Right now we have a CRD which triggers a Nix build of a given flake URL, expected to produce a Docker or OCI image — it chooses a node which can build for the target system, spawns a Job which builds the target, and then imports it into the node's container registry. We assume something like Spegel is running, so any node that needs the image will pick it up.
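For flavour, a NixBuild object following that description might look something like this. The group matches the CRD name shown in the screenshot; the version and field names are my guess at the shape, not the actual schema.

```yaml
# Hypothetical NixBuild manifest; spec field names are invented.
apiVersion: enbi.hrzn.ee/v1alpha1  # group from the CRD; version assumed
kind: NixBuild
metadata:
  name: myapp-v1.2.3
spec:
  flake: "github:example/myapp/v1.2.3"  # flake URL to build
  tag: "myapp:v1.2.3"                   # image tag to import into containerd
  system: "aarch64-linux"               # target system, used to pick a node
```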

The “hard” part (other than writing directly against the k8s API for the first time) was getting the Nix stuff to work well vis-à-vis building in a container while caching everything nicely — the flakes themselves, as well as whatever ends up in the store, as much of it will be reused between versions. Thankfully all the tooling is Cool As Fuck and it was actually really easy. We create a locally-provisioned PersistentVolume per node and stuff $HOME/.cache/nix and the Nix store in there. For now we use a chroot store, but I’d like to try an overlay store in future to avoid potentially duplicating whatever comes along in the nixos/nix image. Importing into the node’s container store is as simple as mounting the host /run and locating containerd’s socket — it differs depending on your k8s distro, and I’m developing on kind while deploying to k3s.
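In pod-spec terms, the build Job's plumbing described above looks roughly like this. It's a sketch with made-up volume names; as noted, the containerd socket's location under /run varies by distro, so the sketch just mounts the whole of /run.

```yaml
# Illustrative volumes for the build Job's pod; names are invented.
spec:
  containers:
    - name: build
      image: nixos/nix
      volumeMounts:
        - name: nix-cache
          mountPath: /build      # chroot store + $HOME/.cache/nix live here
        - name: host-run
          mountPath: /host/run   # reach containerd's socket for the import
  volumes:
    - name: nix-cache
      persistentVolumeClaim:
        claimName: enbi-cache    # backed by a locally-provisioned PV per node
    - name: host-run
      hostPath:
        path: /run               # socket path differs between kind and k3s
```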

I still have to clean it up from this state, and after that I plan to remove the CustomResourceDefinition and trigger builds automatically when needed, getting the source details from annotations on the Deployment, but I'm happy. I don't particularly like manually executing builds, nor do I want to stand up a registry and pre-build everything. My cluster runs on two architectures, but whether any given revision of an application will ever actually run on one, both, or neither(!) of them depends on the application's particular scheduling constraints and the state of the cluster at any given moment. Rather than waste energy pre-building and storing, let's build on-demand instead! 💛🤍💜🖤

jackalgirls & CUE

Today it was finally time to write a policy file for one of my Anubis instances. I use Timoni as a fairly thin wrapper over CUE to write templates for my own k8s deployments, and I found it really shone in this particular instance. I’ll just tl;dr and show the code; here’s an excerpt from my blog engine’s bundle.cue, which is the “entrypoint” for compiling its manifests:

anubis: {
	secretName: "anubis-20250816-071240"
	policy: permitPaths: [{
		name:       "permit-atom-xml"
		path_regex: "^/atom\\.xml$"
	}, {
		name:       "permit-feed-xml"
		path_regex: "^/feed\\.xml$"
	}]
}

I’m aiming to expose just a minimum of configurability first. Here’s how the schema side of that is defined in config.cue:

anubis?: {
	// Needs to already exist in the target namespace. Should have key
	// "ED25519_PRIVATE_KEY_HEX".
	secretName: string
	policy?: {
		permitPaths: *[] | [...close({
			name:       string
			path_regex: string
		})]
	}
}

I grabbed the default root bot policy file from https://github.com/TecharoHQ/anubis/blob/main/data/botPolicies.yaml, and converted it to CUE with cue import botPolicies.yaml. Then we put it in the templates package, add a way to inject our config, and use the config to expand upon the defaults:
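As a hypothetical illustration of what `cue import` does here (the policy entry below is invented, not one of Anubis's real defaults), a YAML rule comes out as an ordinary CUE struct:

```cue
// YAML input:
//   bots:
//     - name: generic-bot
//       path_regex: "^/private/"
//       action: DENY
// cue import renders it as:
bots: [{
	name:       "generic-bot"
	path_regex: "^/private/"
	action:     "DENY"
}]
```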

package templates

#AnubisBotPolicies: {
	#config: #Config
	//# Anubis has the ability to let you import snippets of configuration into the main
	//# configuration file. This allows you to break up your config into smaller parts
	//# that get logically assembled into one big file.
	// ...
	}, if #config.kv.anubis.policy.permitPaths != _|_ for setting in #config.kv.anubis.policy.permitPaths {
		name:       setting.name
		path_regex: setting.path_regex
		action:     "ALLOW"
	}, {
	// ...

Finally, the bit I really like: creating the ConfigMap (which gets mounted as a volume) with the policy YAML:

#AnubisConfigMap: timoniv1.#ImmutableConfig & {
	Config=#config: #Config
	#Kind:   timoniv1.#ConfigMapKind
	#Meta:   #config.metadata
	#Suffix: "-anubis-env"
	#Data: {
		"policy.yml": yaml.Marshal(#AnubisBotPolicies & {#config: Config})
	}
}

Note the careful lack of hand-written YAML at any stage! 💛🤍💜🖤

the Anubis character, by CELPHASE

just keep looking deeper! the answer is in there!

The tool of the day is nftrace. I couldn't work out why some pods weren't able to communicate with each other across the Tailscale mesh. I suspected ACLs, suspected routes weren't getting installed correctly (p.s. ip route show table 52 (!?)), suspected local firewalls, suspected so much. tcpdump only gets you so far.

Finally, on the target node:

$ doas -s
# nix shell nixpkgs#nftrace nixpkgs#nftables
# nftrace add ip daddr 10.59.1.213
# nftrace monitor

Try the request that isn’t making it through a bunch of times until you can isolate the exact sequence. ^C, nftrace remove, and read carefully:

trace id daac839a inet nftrace-table nftrace-chain packet: iif "tailscale0" ip saddr
100.67.157.26 ip daddr 10.59.1.213 ip dscp cs0 ip ecn not-ect ip ttl 64 ip id 32261
ip protocol tcp ip length 60 tcp sport 33233 tcp dport 9090 tcp flags == syn tcp
window 64480
trace id daac839a inet nftrace-table nftrace-chain rule ip daddr 10.59.1.213 meta nftrace
set 1 (verdict continue)
trace id daac839a inet nftrace-table nftrace-chain policy accept
trace id daac839a ip filter FORWARD packet: iif "tailscale0" oif "cni0" ip saddr 100.67.157.26
ip daddr 10.59.1.213 ip dscp cs0 ip ecn not-ect ip ttl 63 ip id 32261 ip length 60 tcp
sport 33233 tcp dport 9090 tcp flags == syn tcp window 64480
trace id daac839a ip filter FORWARD rule counter packets 44827 bytes 28768164 jump
KUBE-ROUTER-FORWARD (verdict jump KUBE-ROUTER-FORWARD)
trace id daac839a ip filter KUBE-ROUTER-FORWARD rule ip daddr 10.59.1.213 counter packets
5001 bytes 6279235 jump KUBE-POD-FW-FIAOHC4WHRKERAQ6 (verdict jump
KUBE-POD-FW-FIAOHC4WHRKERAQ6)
trace id daac839a ip filter KUBE-POD-FW-FIAOHC4WHRKERAQ6 rule counter packets 5 bytes 300
jump KUBE-NWPLCY-ZYSQVVSY5LQY7Q46 (verdict jump KUBE-NWPLCY-ZYSQVVSY5LQY7Q46)
trace id daac839a ip filter KUBE-NWPLCY-ZYSQVVSY5LQY7Q46 rule limit rate 10/minute burst 10
packets meta mark & 0x00010000 != 0x00010000 counter packets 5 bytes 300 log prefix
"DROP by policy monitoring/prometheus-k8s" group 100 (verdict continue)
trace id daac839a ip filter KUBE-POD-FW-FIAOHC4WHRKERAQ6 rule meta mark & 0x00010000 !=
0x00010000 limit rate 10/minute burst 10 packets counter packets 5 bytes 300 log group
100 (verdict continue)
trace id daac839a ip filter KUBE-POD-FW-FIAOHC4WHRKERAQ6 rule meta mark & 0x00010000 !=
0x00010000 counter packets 5 bytes 300 reject (verdict drop)

What's that? log prefix "DROP by policy monitoring/prometheus-k8s"?? Woooooow.