I need to use docker stack deploy ... to start a SWAG container in a production environment. It fails due to the certbot timing out while waiting to validate. The docker-compose.yml file I’m using works if I use docker-compose up. And, docker stack deploy works if the config already has certs in it. So, it seems like there’s some issue with the SWAG image’s certbot trying to run in the Docker Stack environment.
Unfortunately I don’t use docker swarm, so I can’t provide any insight into that, but is there any reason you can’t use the DNS validation that is suggested?
Yes, various reasons, but primarily there are several top-level domains involved and most of their organizations are not (yet) supporting certbot DNS automation.
So the issue has to do with opaque differences between certbot behavior in docker-compose up and docker stack deploy. As specifically as I can determine in my case, when it’s started by docker-compose up, the certbot uses the domain’s ipv4 address for validation, but when it’s started by docker stack deploy, the certbot uses the domain’s ipv6 address. I can see the difference by using netstat -an while the newly-started container is waiting for validation. Under docker-compose the certbot is listening on ipv4 port 0.0.0.0:80, but under docker stack deploy it is listening on :::80. The certbot’s different behavior points to a separate issue that causes the validation failure, and that issue is that the Docker daemon is not correctly configuring ipv6 forwarding to containers…
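For anyone who wants to repeat the netstat -an comparison without reading the output by eye, here is a minimal sketch in Python. It assumes the Linux netstat column layout (local address in the fourth column, state in the sixth), and the function name listener_families and the SAMPLE text are made up for illustration. It only classifies what netstat reports; it says nothing about whether forwarding to the container actually works.

```python
def listener_families(netstat_output: str, port: int = 80) -> set[str]:
    """Return which address families have a socket in LISTEN state on `port`.

    Assumes Linux-style `netstat -an` lines such as:
        tcp   0  0 0.0.0.0:80   0.0.0.0:*   LISTEN
        tcp6  0  0 :::80        :::*        LISTEN
    """
    families = set()
    suffix = f":{port}"
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers, UDP sockets, and anything not in LISTEN state.
        if len(fields) < 6 or fields[5] != "LISTEN":
            continue
        local = fields[3]
        if not local.endswith(suffix):
            continue
        host = local[: -len(suffix)]
        # An IPv6 local address (e.g. "::") still contains a colon
        # after the port has been stripped off.
        families.add("ipv6" if ":" in host else "ipv4")
    return families


# Hypothetical output captured while the container waits for validation.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN
tcp6       0      0 :::80                   :::*                    LISTEN
"""

print(listener_families(SAMPLE))  # prints {'ipv6'}
```

In practice the input would be the real netstat -an output from the host, captured once under docker-compose up and once under docker stack deploy.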
I temporarily solved the problem by removing the DNS entry that provides the host’s ipv6 address.
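A quick way to confirm that the ipv6 DNS entry is really gone (or to check whether a domain has one at all) is to look at what the name resolves to. The sketch below is one way to do that in Python with only the standard library. The helper names split_by_family and resolve are hypothetical, and the addresses in the usage lines come from the reserved documentation ranges, not from a real host. Note that resolve reports whatever the local resolver returns, which may lag behind a recent DNS change because of caching.

```python
import ipaddress
import socket


def split_by_family(addresses):
    """Split textual IP addresses into (ipv4_list, ipv6_list)."""
    v4, v6 = [], []
    for text in addresses:
        ip = ipaddress.ip_address(text)
        (v6 if ip.version == 6 else v4).append(text)
    return v4, v6


def resolve(hostname, port=80):
    """Return the distinct addresses the local resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})


# Usage with documentation-range addresses standing in for real DNS answers.
v4, v6 = split_by_family(["192.0.2.10", "2001:db8::10"])
print("ipv4:", v4)
print("ipv6:", v6)
```

For a real check, pass the result of resolve("your.domain") into split_by_family; an empty ipv6 list means the resolver no longer sees an ipv6 address for the name.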
As @aptalca suspected in the original ticket, docker stack deploy isn’t doing the port forwarding correctly, which is causing the validation to fail. Good job on finding that.
Unfortunately we don’t test our containers on docker swarm (or k8s for that matter) nor does anyone use it within the team so it’s hard for us to support.
I haven’t seen anything that confirms my conclusion that certbot behaves differently in the docker-compose up world vs docker stack deploy world, but given the totally different implementations of the two worlds, it wouldn’t surprise me. Given the much deeper integration of docker stack into the Docker engine, I would encourage you all to look at having more explicit swarm testing and support.
As and when we start looking at different technologies, I suspect we will lean towards k8s rather than docker swarm. A few people within the team are already familiar with it, so it would be only natural to progress on to that.
I’m learning the hard way that ipv6 support in Docker is still a bit of a mess (though way better than it was).
In case anyone stumbles here with similar problems, there is a Docker image that makes ipv6 behave like ipv4 (NAT-centric), which is a simple solution. The README has a lot of links to very good discussions about the matter, and overall it acknowledges that this simple solution is not ideal, but is a practical approach that works today. The project is on GitHub: robbertkl/docker-ipv6nat (Extend Docker with IPv6 NAT, similar to IPv4).