`docker stack deploy` fails to start SWAG

hughsw · 15 February 2021 11:09

I need to use docker stack deploy ... to start a SWAG container in a production environment. It fails due to the certbot timing out while waiting to validate. The docker-compose.yml file I’m using works if I use docker-compose up. And, docker stack deploy works if the config already has certs in it. So, it seems like there’s some issue with the SWAG image’s certbot trying to run in the Docker Stack environment.

I filed an issue here: `docker stack deploy` certbot fails to validate certs · Issue #88 · linuxserver/docker-swag · GitHub
The issue was summarily closed with the claim that Docker Swarm port forwarding is the problem, even though the image can run in the swarm and both ports are being forwarded. Could someone please provide insight or re-open the issue?

Thanks.

(Posting here in General because Discourse won’t let newbie me post in LSIO Discussion)

j0nnymoe · 15 February 2021 11:23

Unfortunately I don’t use docker swarm so I can’t provide any insight into that, but is there any reason you can’t use the DNS validation that is suggested?

hughsw · 15 February 2021 11:37

Yes, various reasons, but primarily there are several top-level domains involved and most of their organizations are not (yet) supporting certbot DNS automation.

hughsw · 16 February 2021 02:10

So the issue has to do with opaque differences between certbot behavior in docker-compose up and docker stack deploy. As specifically as I can determine in my case, when it’s started by docker-compose up, the certbot uses the domain’s ipv4 address for validation, but when it’s started by docker stack deploy the certbot uses the domain’s ipv6 address. I can see the difference by using netstat -an while the newly-started container is waiting for validation. Under docker-compose the certbot is listening on ipv4 port 0.0.0.0:80, but under docker stack deploy it is listening on :::80. The certbot’s different behavior points to a separate issue that causes the validation failure, and that issue is that the Docker daemon is not correctly configuring ipv6 forwarding to containers…
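For anyone wanting to check the same thing from the outside, here is a minimal sketch (not part of SWAG or certbot) that tries a plain TCP connection to port 80 over each address family. The function name `probe` and the timeout value are illustrative assumptions. It only shows whether the port is reachable over ipv4 and ipv6; it does not perform an ACME validation.

```python
import socket


def probe(host, port=80, timeout=5.0):
    """Try a TCP connection to host:port over each address family.

    Returns a dict mapping "ipv4" / "ipv6" to:
      True  - at least one address of that family accepted the connection
      False - addresses exist, but every connection attempt failed
      None  - the name has no address of that family
    """
    results = {}
    for label, family in (("ipv4", socket.AF_INET), ("ipv6", socket.AF_INET6)):
        try:
            infos = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
        except socket.gaierror:
            results[label] = None  # nothing published for this family
            continue

        connected = False
        for fam, socktype, proto, _canonname, sockaddr in infos:
            try:
                with socket.socket(fam, socktype, proto) as sock:
                    sock.settimeout(timeout)
                    sock.connect(sockaddr)
                    connected = True
                    break
            except OSError:
                continue  # try the next address, if any
        results[label] = connected
    return results
```

Run it from a machine outside the Docker host while the container is waiting for validation, e.g. `probe("your.domain")` with your own domain substituted. A result where `"ipv4"` is `True` and `"ipv6"` is `False` would be consistent with the ipv6 forwarding problem described above.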

I temporarily solved the problem by removing the DNS entry that provides the host’s ipv6 address.
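A quick way to confirm that the ipv6 entry is really gone (or still cached) is to list what the local resolver returns for the domain, grouped by family. This is a rough sketch; `resolve` and `split_by_family` are made-up helper names, and the result reflects only the resolver of the machine it runs on, not what the validation server sees.

```python
import ipaddress
import socket


def resolve(hostname, port=80):
    """Return the addresses the local resolver reports for hostname."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    return [info[4][0] for info in infos]


def split_by_family(addresses):
    """Group textual IP addresses into ipv4 and ipv6 lists, without duplicates."""
    result = {"ipv4": [], "ipv6": []}
    for text in addresses:
        # Drop any "%scope" suffix before parsing (link-local ipv6 addresses).
        ip = ipaddress.ip_address(text.split("%")[0])
        key = "ipv6" if ip.version == 6 else "ipv4"
        if text not in result[key]:
            result[key].append(text)
    return result
```

Calling `split_by_family(resolve("your.domain"))` with your own domain should return an empty `"ipv6"` list once the removal of the ipv6 DNS entry has propagated.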

j0nnymoe · 16 February 2021 08:16

As @aptalca suspected in the original ticket, docker stack deploy isn’t doing the port forwarding correctly, which is causing the validation to fail. Good job on finding that.

Unfortunately we don’t test our containers on docker swarm (or k8s for that matter) nor does anyone use it within the team so it’s hard for us to support.

hughsw · 16 February 2021 13:36

Ouch! swarm and k8s are where things are moving!

I haven’t seen anything that confirms my conclusion that certbot behaves differently in the docker-compose up world vs docker stack deploy world, but given the totally different implementations of the two worlds, it wouldn’t surprise me. Given the much deeper integration of docker stack into the Docker engine, I would encourage you all to look at having more explicit swarm testing and support.

j0nnymoe · 16 February 2021 14:09

As and when we start looking at different technologies I suspect we will lean towards k8s rather than docker swarm. A few people within the team are already familiar with it so it would be only natural to progress on to that.

Roxedus · 16 February 2021 14:15

Pretty sure this all stems from the fact that swarm is giving your containers ipv6 to begin with, while a stand-alone setup does not.

As for SWAG, it does not try to replace the ingress functions in either swarm or k*s, it’s just not suited to run in these environments.

hughsw · 16 February 2021 14:50

I’m learning the hard way that ipv6 support in Docker is still a bit of a mess (though way better than it was).

In case anyone stumbles here with similar problems, there is a Docker image that makes ipv6 behave like ipv4 (NAT-centric), which is a simple solution. The README has a lot of links to very good discussions about the matter, and overall it acknowledges that this simple solution is not ideal, but is a practical approach that works today. GitHub - robbertkl/docker-ipv6nat: Extend Docker with IPv6 NAT, similar to IPv4

aptalca · 16 February 2021 15:01

Swarm is pretty much dead at this point (docker enterprise was bought by a k8s centric company). And k8s is way overkill for a homelab in my opinion.

We recommend docker-compose for a homelab (or even production scenarios that are only a couple of containers, like an nginx+mariadb type setup).