
Scaleway Kapsule and Rancher-managed Hetzner Kubernetes clusters

I honestly think this is the third time I’m moving things around. First I was on Proxmox, managing a cluster of bare-metal Hetzner servers myself (from the server auction page). Then I was torn between a home server and one on Hetzner. I was using many images from the TurnKey Linux project. Nice and interesting, but it required an immense investment of time.

Then I decided I wanted a managed Kubernetes service. Not long ago (half a month) I went over to Digital Ocean’s managed Kubernetes service. For a not-so-small price I got a 3-node cluster with 24GB of RAM and 12 cores. I started installing my stuff and quickly ran out of resources, only to be forced to pay more.

I knew Rancher before starting all this, and one option was to install Rancher by hand and then have it deploy other Kubernetes clusters, but … that seemed like a lot of work. One of the ideas was to get the smallest/cheapest Digital Ocean Kubernetes cluster and run Rancher HA on that. The smallest production-grade option is 2.5GB of RAM per node, and that’s insufficient.

So here comes the final solution:

  • got myself a Scaleway cluster with 4GB of RAM per node, 3 nodes dedicated exclusively to Rancher, with a subdomain attached to it. Deploying Rancher in HA is extremely easy following the official docs;
  • then hooked up the Hetzner node driver in the Rancher UI, a one-time thing you do manually, and configured a few node templates as required;
  • then asked Rancher to connect to Hetzner and deploy 3 control-plane nodes (etcd/masters) + 3 workers (8 cores/32GB of RAM);
  • finally, moved everything over from Digital Ocean and destroyed the Kubernetes cluster there.
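To put the two worker pools side by side, here is a quick back-of-the-envelope check in Python. It rests on two assumptions that the post does not spell out: that the Digital Ocean figures (24GB, 12 cores) are totals split evenly across the 3 nodes, and that the Hetzner figures (8 cores/32GB) are per worker. The `Pool` class is purely illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pool:
    """A homogeneous pool of worker nodes (illustrative helper)."""
    nodes: int
    cores_per_node: int
    ram_gb_per_node: int

    @property
    def cores(self) -> int:
        return self.nodes * self.cores_per_node

    @property
    def ram_gb(self) -> int:
        return self.nodes * self.ram_gb_per_node


# Assumption: 24GB / 12 cores in total, evenly split over 3 nodes.
digital_ocean = Pool(nodes=3, cores_per_node=4, ram_gb_per_node=8)
# Assumption: 8 cores / 32GB is the size of each of the 3 workers.
hetzner = Pool(nodes=3, cores_per_node=8, ram_gb_per_node=32)

ram_ratio = hetzner.ram_gb / digital_ocean.ram_gb
core_ratio = hetzner.cores / digital_ocean.cores

if __name__ == "__main__":
    print(f"RAM:   {digital_ocean.ram_gb}GB -> {hetzner.ram_gb}GB (x{ram_ratio:g})")
    print(f"Cores: {digital_ocean.cores} -> {hetzner.cores} (x{core_ratio:g})")
```

Under those assumptions the new workers have four times the memory and twice the cores; the control-plane nodes are left out of the comparison.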

Final cost: around $120 for 4x the resources, with all the infrastructure (including this blog) managed by a combination of UI (monitoring, alerting and all the nice Rancher stuff) and simple Bash scripts that effectively run Helm commands with pre-set values and secrets (the secrets are managed through Mozilla’s SOPS, using a GPG key, and committed to GitLab encrypted). The cost is covered by the stateless API I run for a side business my family runs, so in short, it’s almost free.
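The actual scripts are Bash and are not shown here, so what follows is only a rough sketch of the idea, written in Python so the pieces are easy to test: decrypt a SOPS-encrypted values file, then hand both the plain and the decrypted values to `helm upgrade --install`. The `Release` class, the file paths, the chart name and the temporary file location are all hypothetical; only the `sops -d` and `helm upgrade --install ... -f` invocations are real CLI usage. The functions only build the command lines and do not run anything.

```python
from __future__ import annotations

import shlex
from dataclasses import dataclass


@dataclass(frozen=True)
class Release:
    """One Helm release and its value files (hypothetical layout)."""
    name: str          # Helm release name
    chart: str         # chart reference, e.g. "repo/chart"
    namespace: str     # target Kubernetes namespace
    values_file: str   # plain values, committed as-is
    secrets_file: str  # SOPS-encrypted values, committed encrypted


def decrypt_command(release: Release) -> list[str]:
    # `sops -d FILE` writes the decrypted document to stdout;
    # a wrapper script would redirect that into a temporary file.
    return ["sops", "-d", release.secrets_file]


def deploy_command(release: Release, decrypted_secrets: str) -> list[str]:
    # Later `-f` files override earlier ones, so secrets go last.
    return [
        "helm", "upgrade", "--install", release.name, release.chart,
        "--namespace", release.namespace,
        "-f", release.values_file,
        "-f", decrypted_secrets,
    ]


if __name__ == "__main__":
    # Hypothetical release; none of these names come from the real setup.
    blog = Release(
        name="blog",
        chart="example/blog-chart",
        namespace="blog",
        values_file="values/blog.yaml",
        secrets_file="secrets/blog.enc.yaml",
    )
    print(shlex.join(decrypt_command(blog)))
    print(shlex.join(deploy_command(blog, "/tmp/blog.secrets.yaml")))
```

Keeping the command construction separate from execution makes it easy to review what would be run before pointing it at a real cluster; the decrypted file should be deleted once Helm has finished.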

For those interested, I used to manage a $154 cluster on Proxmox over Hetzner bare-metal, and I’ve been fighting prices down ever since while trying to keep the same set of resources at a fraction of the cost.

Why this set-up? Well, I trust Scaleway to maintain a decent uptime for the Rancher cluster. The cluster that Rancher created sends its etcd snapshots to Digital Ocean Spaces (because Scaleway Object Storage has been having errors with some S3 implementations). Because I trust Scaleway to manage an HA Kapsule deployment, the Rancher running inside it should be highly available for the foreseeable future (years, I hope).

If Rancher is up and acting as my control center, then I can treat, and I do treat, all Hetzner deployments as ephemeral. This blog is backed up daily, and everything else is either temporary data I can lose, something I can recreate from code (e.g. containers with my applications inside), or just ephemeral deployments I don’t care about losing.

Since I don’t have stringent 99.99% uptime requirements, the same goes for all my other websites and projects: having the MySQL Operator with backup schedules set to hourly or daily, writing to an off-site S3-compatible bucket, helps in treating everything as replaceable and ephemeral.

Oh … and I moved in 3h, manually, coffee breaks included. Imagine if all my scripts were in a CI/CD pipeline that deployed everything automatically. Not there yet, but I’m aiming to reach that nirvana soon …