Usually there was no problem. Stuff worked just fine. Certificates were generated and renewed automatically. https:// links opened without ugly browser warnings about how you’re about to get hacked and it’s the end of the known universe.
But when it wasn’t “usually”, when Traefik just happened to restart for whatever reason, all of that was obliterated. Since Traefik was running on ephemeral storage, i.e. nothing was really persisted, innocently tweaking some configuration (which resulted in a restart) could be catastrophic. You know, self-signed certificates and ugly browser warnings.
The alternative is to use cert-manager to manage the certificates, persist them in the cluster (as Kubernetes Secrets, described by Certificate custom resources) and feed those into Traefik for use. There was still some downtime involved during the switch since I’m using the DNS01 challenge to get wildcard certs, and DNS is notoriously slow to propagate.
There were some tricky bits. The cert-manager Helm chart uses CRDs, and installing those has to be explicitly enabled. Then came the surprise that while Traefik’s built-in ACME integration seemed to work with Linode’s DNS out of the box, for cert-manager the Linode support had to be installed separately as a webhook. Even the official repo took significant trial and error to get working, mostly because the repo didn’t have git refs (tags) set up as expected (see my currently working configuration).
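A minimal sketch of the CRD part, assuming cert-manager is installed from its Helm chart with a values file. Which key applies depends on the chart version, and this is not a copy of the working configuration linked above.

```yaml
# Helm values for the cert-manager chart (illustrative sketch)

# Newer chart versions enable CRD installation with this key:
crds:
  enabled: true

# Older chart versions used this key instead:
# installCRDs: true
```

Without one of these, the chart installs the controller but not the Certificate and Issuer resource types, so anything that tries to create them fails.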
Then I had to create Certificate resources for each of the domains that Traefik auto-managed for me before. This was somewhat annoying since some of my stuff is managed in separate repos (like the static landing page), which had to be updated separately as well. Previously Traefik would generate the certs from its own TLS config, but now each individual Ingress’s TLS config had to reference the secret generated by cert-manager (example).
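Roughly, the pairing looks like the sketch below. Every name in it is a placeholder chosen for illustration (the `example.com` domain, the `letsencrypt-dns01` ClusterIssuer, the `landing-page` service and the secret name); none of them come from the actual setup.

```yaml
# Certificate: asks cert-manager to obtain a wildcard cert and store it in a Secret
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: example-com-wildcard        # placeholder name
  namespace: default
spec:
  secretName: example-com-wildcard-tls   # Secret that cert-manager creates and renews
  issuerRef:
    name: letsencrypt-dns01         # assumed ClusterIssuer using the DNS01 solver
    kind: ClusterIssuer
  dnsNames:
    - example.com
    - "*.example.com"
---
# Ingress: references the Secret above instead of relying on Traefik's ACME
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: landing-page                # placeholder name
  namespace: default
spec:
  tls:
    - hosts:
        - www.example.com
      secretName: example-com-wildcard-tls
  rules:
    - host: www.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: landing-page  # placeholder service
                port:
                  number: 80
```

An Ingress can only reference a Secret in its own namespace, so the Certificate has to live wherever the Ingress that uses it does. That is part of why apps managed in separate repos each needed their own update.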
I also had to update Traefik’s settings so that it wouldn’t try to use its built-in certificate fetching and would rely on cert-manager instead. This was again tricky because the documentation wasn’t clear at all about how to set it up (even though they recommend doing it this way). I had to specify tls.domains without specifying certificatesResolvers (see currently working config).
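In Traefik’s static configuration, that shape looks something like the sketch below. It assumes an entry point named `websecure` and the placeholder domain `example.com`; how these keys map onto Helm chart values depends on the chart version, so treat it as an illustration rather than the linked config.

```yaml
# Traefik static configuration (illustrative sketch)
entryPoints:
  websecure:                # assumed entry point name
    address: ":443"
    http:
      tls:
        # Domains are listed, but no certResolver is set on the entry point
        # and no top-level certificatesResolvers section is defined,
        # so Traefik does not request certificates via ACME itself.
        domains:
          - main: example.com
            sans:
              - "*.example.com"
```

With no resolver configured, the certificates Traefik serves come from the TLS secrets referenced by the Ingresses, which are the ones cert-manager maintains.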