My 4-node k3s cluster (where this blog is hosted too) kept dying every now and then. Looking at kubectl describe nodes, it quickly became evident that this was caused by the nodes running out of disk space. Once a node gets tainted with HasDiskPressure, pods may get evicted and the kubelet starts using (quite a lot of) CPU trying to free disk space by garbage collecting container images and cleaning up ephemeral storage.
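
A quick way to spot the affected nodes is to grep the node descriptions for the condition and the matching taint, something like:

# the DiskPressure condition and the disk-pressure taint both show up in the node descriptions
kubectl describe nodes | grep -E '^Name:|DiskPressure|disk-pressure'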

My setup by default uses local storage (the local-path provisioner), where volumes are actually local folders on the node. This means that pods that use persistent storage are stuck on the same node forever and can’t just move around. This makes eviction a problem, since they have nowhere else to go. It also means that disk usage is actually disk usage on the node, and not on some block volume over the network.
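
You can see exactly which node and host directory each of these volumes is pinned to; this is a rough sketch, assuming the PVs are hostPath volumes created by the local-path provisioner:

# map each PV to its claim, node (via nodeAffinity) and host directory
kubectl get pv -o \
  custom-columns='NAME:.metadata.name,CLAIM:.spec.claimRef.name,NODE:.spec.nodeAffinity.required.nodeSelectorTerms[0].matchExpressions[0].values[0],PATH:.spec.hostPath.path'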

I’d ssh into the node VPSs and check what was using disk with ncdu -xq. There were some pretty obvious culprits: loki keeping 3 gigs of logs around and MariaDB using a whopping 10 gigs. I figured there were two ways to deal with MariaDB’s usage: either split the single server into multiple MariaDB installations that could potentially run on different nodes, instead of piling everything onto the node hosting the primary, or actually use an external block volume for this purpose.
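
If you’d rather not click through ncdu, a rough equivalent over the local-path data directory looks like this (the hostname is a placeholder, and the path is k3s’s default local-path location, so adjust for your setup):

# per-volume disk usage on one node; hostname is a placeholder
ssh node2 "sudo du -xh --max-depth=1 /var/lib/rancher/k3s/storage | sort -h"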

Since I’m already using an external volume for my Seafile installation (and updating the configs of everything that uses MariaDB is a pain in the neck), I decided to go that route. I also took the opportunity to upgrade the MariaDB chart’s major version. I’ve been putting that off for a while because the migration was effort.

The process looked something like this:

  1. use kubectl port-forward and mysqldump to back up the whole thing (see the sketch after this list)
  2. update the MariaDB Application in Argo
  3. delete the old PVs and PVCs
  4. import the backups
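
Step 1 was mostly a matter of getting the in-cluster MariaDB port onto my machine; the service name and namespace below are placeholders, so adjust them to whatever your chart created:

# expose the in-cluster MariaDB on localhost:3306 for mysqldump
# (service name and namespace are placeholders)
kubectl -n mariadb port-forward svc/mariadb 3306:3306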

There were a few unexpected hiccups along the way. First, port-forward kept dropping the connection. While I wasn’t sure what the cause was, using nc as suggested on a forum and dumping each table separately instead of all at once made the blast radius smaller. My dump script was something like this:

# read the root password without echoing it
read -rs MYSQL_PASS
# a function rather than an alias, so it also works non-interactively
lister() {
  mysql -s -r --skip-column-names -u root -p"$MYSQL_PASS" -h localhost -P 3306 --protocol TCP -e "$1"
}
# dump every table of every non-system database into its own file
for db in $(lister 'select schema_name from information_schema.schemata where schema_name not in ("test", "mysql") and schema_name not like "%schema"'); do
  for table in $(lister "show tables from $db"); do
    mysqldump -u root -p"$MYSQL_PASS" -h localhost -P 3306 --protocol TCP "$db" --tables "$table" -r "$db.$table.sql" &&
      echo "Finished dumping $db/$table" ||
      (echo "Failed dumping $db/$table" && read -r)  # pause until the port-forward is back
  done
done

The read in the failure branch would pause the execution while I restarted the failed port-forward command. Once the backup was all done, it was time to update the manifests and get rid of the old PersistentVolumes. The latter proved a bit tricky, because the PersistentVolumeClaims would get stuck in the “Terminating” state and never actually get deleted. A little searching led me to the suggestion to remove the PVC’s finalizers, and while I don’t understand why it’d get stuck in the first place, this did the trick.
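
The finalizer trick boils down to patching them away (usually it’s kubernetes.io/pvc-protection, which keeps a claim around while a pod still references it); the PVC name and namespace below are placeholders:

# clear the finalizers so the Terminating PVC can actually be removed
kubectl -n mariadb patch pvc data-mariadb-0 \
  --type merge -p '{"metadata":{"finalizers":null}}'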

With the old PVs gone and (since local-path uses the default “Delete” reclaim policy) the space freed, the cluster quickly recovered (though everything depending on MariaDB went unhealthy, of course), and it was time to import everything back. The import script was just a dumb reverse of the dump:

for f in ./*; do
  # file names look like "db.table.sql", so the part before the first dot is the database
  db=$(basename "$f" | sed 's/\..*$//')
  mysql -u root -p"$MYSQL_PASS" -h localhost -P 3306 --protocol TCP -e "create database if not exists $db" &&
    mysql -u root -p"$MYSQL_PASS" -h localhost -P 3306 --protocol TCP "$db" < "$f" &&
    echo "Imported $f" ||
    (echo "Failed to import $f" && read -r)  # pause, fix the port-forward, press enter
done

This was the point where I realized that the dump didn’t contain user information, so I had to manually recreate the MariaDB users and grant them their old privileges. (I should really automate that somehow…) Now upgraded and running on a new and shiny MariaDB install, I hope I won’t have to touch the cluster config again for a while.
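
For future reference: since the dump script explicitly skips the mysql schema (which is where the accounts live), the grants could probably be exported alongside the data with the usual SHOW GRANTS trick. A sketch, where the excluded users are a guess at the system accounts to skip:

# generate a replayable grants.sql for all non-system accounts
mysql -u root -p"$MYSQL_PASS" -h localhost -P 3306 --protocol TCP -N -e \
  'select concat("show grants for ", quote(user), "@", quote(host), ";") from mysql.user where user not in ("root", "mariadb.sys", "mysql")' |
  mysql -u root -p"$MYSQL_PASS" -h localhost -P 3306 --protocol TCP -N |
  sed 's/$/;/' > grants.sql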