Version control in the sense that I haven’t made any Ansible (or similar) scripts to deal with this, so I have to keep writing blog posts so that I don’t forget what I did and how I managed it.

For ages now I’ve had Apache crashing (at least so it seemed) at regular intervals. I had no idea what was causing it, so I just added a crontab entry to (try to) start the Apache service every 5 minutes. That solved the issue of 502 Bad Gateway errors, but not the cause for them.

I figured it had something to do with Certbot. The error logs for Apache didn’t say much, but the little they did say was related to Certbot. I figured it had some scheduled job that would attempt to renew my certificates, which would fail (of course) because of my “unusual” setup of Apache behind an nginx reverse proxy. It seemed like it tweaked some config files, then issued an Apache reload, which collided with nginx’s ports, so Apache failed to start (resulting in the “crashing” I noticed) and the renewal failed.

The problem was that I had no idea where such a schedule would be. It was nowhere to be seen. There was no certbot service (as far as I could see), nor anything in crontab. I had almost given up when I ran into a blog post describing just what I was struggling with. That’s how I found out about Certbot’s systemd timer (certbot.timer), which was indeed set to run “twice a day”, and the service that goes with it.
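For anyone hunting for the same schedule, here is a minimal sketch of how it can be found with systemctl. It assumes a systemd-based distribution and that the unit is named certbot.timer, as it was here; other packages may name it differently.

```shell
# List every systemd timer, active or not, and filter for Certbot.
systemctl list-timers --all | grep -i certbot

# Print the timer's unit file; the OnCalendar= line holds the schedule.
systemctl cat certbot.timer

# Show the current state of the timer and the service it triggers.
systemctl status certbot.timer certbot.service
```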

Just as the author of the post suggests, I immediately issued systemctl disable for both of them (then rebooted, since disable on its own does not stop a timer that is already running) and that solved the root cause of my problem.
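A minimal sketch of that step, assuming the unit names certbot.timer and certbot.service and a root shell. The --now flag stops the units in the same step, which should make the reboot unnecessary.

```shell
# Disable at boot and stop the running units in one step (run as root).
systemctl disable --now certbot.timer certbot.service

# Check the result: the timer should report "disabled" and "inactive".
systemctl is-enabled certbot.timer
systemctl is-active certbot.timer
```

If the service turns out to be a static unit with nothing to disable, systemctl will say so, and switching off the timer alone should be enough to stop the scheduled runs.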

In a very similar way, I set up a crontab entry for renewal too. It’s just a certbot renew, with the --pre-hook and --post-hook options pointing to separate bash scripts I wrote to shut down my usual servers and bring up the temporary ones for cert renewal, then back. Because the scripts are attached as hooks, they only run when there is actually something to be renewed. This way I don’t have to manually check for cert expiration, as Certbot takes care of that.
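To give an idea of what such hooks can look like, here is a rough sketch. Everything specific in it is an assumption: the site names, the script path in the comment and the exact order of steps are placeholders, not the real setup. The real setup uses two separate scripts; this folds both into one file with a pre and a post mode purely to keep the example short. Setting DRY_RUN=1 prints the commands instead of running them.

```shell
#!/bin/sh
# cert-hooks.sh: hypothetical pre/post hook for "certbot renew".
# Possible crontab usage (path and schedule are placeholders):
#   17 3 * * * certbot renew --pre-hook "/usr/local/sbin/cert-hooks.sh pre" \
#                            --post-hook "/usr/local/sbin/cert-hooks.sh post"

# Placeholder vhost names; replace with the real site config names.
PROXIED_SITES="example.com-proxied"   # normal vhosts served behind nginx
RAW_SITES="example.com-raw"           # temporary vhosts used only for renewal

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Before renewal: free the ports held by nginx, switch Apache to the raw vhosts.
pre_hook() {
    run systemctl stop nginx
    for site in $PROXIED_SITES; do run a2dissite "$site"; done
    for site in $RAW_SITES; do run a2ensite "$site"; done
    run systemctl restart apache2
}

# After renewal: switch Apache back to the proxied vhosts, then start nginx.
post_hook() {
    for site in $RAW_SITES; do run a2dissite "$site"; done
    for site in $PROXIED_SITES; do run a2ensite "$site"; done
    run systemctl restart apache2
    run systemctl start nginx
}

case "${1:-}" in
    pre)  pre_hook ;;
    post) post_hook ;;
esac
```

Running DRY_RUN=1 ./cert-hooks.sh pre is a cheap way to check the order of steps before letting Certbot call it for real.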

One of the important things is that Certbot keeps adding Listen 443 back to Apache’s ports.conf, which causes the port collision with nginx and the errors. I solved this by using sed to comment out any active Listen 443 lines in ports.conf. To the --post-hook script I added the following: sed -i -E "s/^[^#]?(\s*Listen 443)/#\1/g" /etc/apache2/ports.conf.
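The sed command is easy to try on a scratch file before pointing it at the real ports.conf. The sketch below wraps it in a small function (the function name and the sample file contents are made up for the demo) and assumes GNU sed, since -i and \s are used.

```shell
#!/bin/sh
# Comment out active "Listen 443" lines in a ports.conf-style file.
# Same expression as in the post-hook, only with the file name as a parameter.
comment_out_listen_443() {
    sed -i -E "s/^[^#]?(\s*Listen 443)/#\1/g" "$1"
}

# Demo on a throwaway file with one indented active line,
# one unindented active line and one already commented line.
demo="$(mktemp)"
printf '%s\n' \
    'Listen 8080' \
    '<IfModule ssl_module>' \
    '    Listen 443' \
    '</IfModule>' \
    'Listen 443' \
    '#Listen 443' > "$demo"

comment_out_listen_443 "$demo"
cat "$demo"
```

Lines that already start with # are left alone, so running it on every renewal does not pile up extra # characters. Note that the pattern is not anchored at the end, so a line such as Listen 4430 would be commented out as well.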

The scripts are a pain in the neck. Adding a new domain/vhost means first shutting down everything, making a new Apache vhost for the new domain, doing a certbot run -d new.domain, disabling the unneeded vhosts (as Certbot generates one more for port 443), and creating a proxied vhost that has the correct cert files etc. all set. Not to mention that I then have to add the new domain’s (raw and proxied) vhosts to the renew hook scripts manually. Needless to say, it’s not something I’d like to do regularly.

Next up would be a script that could do the following:

  • add new domains (all with separate vhosts, ports to listen on and web root folders)
  • handle the renew process (shut down nginx and the proxied vhosts, bring up the raw vhosts, then back once the renew is done)

I think I’ll get it done over a weekend the next time I need to set up a new domain.