What happened?
We upgraded from 0.25.2 to 0.29.2 last month. Since the upgrade, Pomerium hasn’t been able to renew certificates.
With this configuration:
autocert: true
autocert_ca: https://acme.zerossl.com/v2/DV90
autocert_eab_key_id: ...
autocert_eab_mac_key: ...
I see a few different things in the logs. The first looks like a ZeroSSL renewal error saying that the certificate has already been replaced by a different order, as if something has gotten out of sync:
2025-06-18T04:15:38.518146+00:00 pomserver pomerium[772354]: {"level":"error","time":"2025-06-18T04:15:38Z","logger":"renew","msg":"will retry","service":"autocert","error":"[exampleroute1.example.com] Renew: [exampleroute1.example.com] creating new order: attempt 1: https://acme.zerossl.com/v2/DV90/newOrder: HTTP 409 urn:ietf:params:acme:error:alreadyReplaced - The \"replaces\" field identifies a certificate that has already been replaced by a different order (ca=https://acme.zerossl.com/v2/DV90)","attempt":1,"retrying_in":60,"elapsed":8.582434196,"max_duration":2592000}
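For anyone triaging a lot of these, here is a minimal Python sketch that pulls the HTTP status and ACME error type out of a line like the one above. It assumes the syslog prefix never contains a `{` and that the error text keeps the `HTTP <status> urn:ietf:params:acme:error:<type>` shape seen here; the function names are made up for illustration. As far as I can tell, the `replaces` field belongs to the ACME Renewal Information (ARI) extension, but treat that as my reading rather than something confirmed from the Pomerium side.

```python
import json
import re


def parse_autocert_line(line: str) -> dict:
    """Drop the syslog prefix and decode the JSON payload of a log line."""
    start = line.index("{")  # assumption: the prefix itself contains no brace
    return json.loads(line[start:])


def acme_problem(error: str):
    """Return (http_status, acme_error_type), or None if the pattern is absent."""
    m = re.search(r"HTTP (\d{3}) urn:ietf:params:acme:error:(\w+)", error)
    return (int(m.group(1)), m.group(2)) if m else None


# The renewal error from the log above, copied as-is.
sample = (
    '2025-06-18T04:15:38.518146+00:00 pomserver pomerium[772354]: '
    '{"level":"error","time":"2025-06-18T04:15:38Z","logger":"renew",'
    '"msg":"will retry","service":"autocert",'
    '"error":"[exampleroute1.example.com] Renew: [exampleroute1.example.com] '
    'creating new order: attempt 1: https://acme.zerossl.com/v2/DV90/newOrder: '
    'HTTP 409 urn:ietf:params:acme:error:alreadyReplaced - The \\"replaces\\" '
    'field identifies a certificate that has already been replaced by a '
    'different order (ca=https://acme.zerossl.com/v2/DV90)",'
    '"attempt":1,"retrying_in":60,"elapsed":8.582434196,"max_duration":2592000}'
)

entry = parse_autocert_line(sample)
print(entry["logger"], acme_problem(entry["error"]))
```

Running the same two functions over the whole journal would show whether every route hits `alreadyReplaced` or only some of them.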
I tried removing one of the directories from /etc/pomerium/certificates/acme.zerossl.com-v2-dv90 and restarting to force it to obtain a new certificate instead of renewing, but negotiation failed because no compatible challenge could be found. At the time this message was logged, I don’t think Pomerium was listening yet:
2025-06-18T04:08:20.698721+00:00 pomserver pomerium[772354]: {"level":"error","time":"2025-06-18T04:08:20Z","logger":"obtain","msg":"could not get certificate from issuer","service":"autocert","identifier":"exampleroute2.example.com","issuer":"acme.zerossl.com-v2-DV90","error":"[exampleroute2.example.com] solving challenges: exampleroute2.example.com: no solvers available for remaining challenges (configured=[tls-alpn-01] offered=[http-01 dns-01] remaining=[http-01 dns-01]) (order=https://acme.zerossl.com/v2/DV90/order/xxxordernumberexample (ca=https://acme.zerossl.com/v2/DV90)"}
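To make the mismatch in that message explicit, here is a small Python sketch that reads the `configured=[...]`, `offered=[...]` and `remaining=[...]` lists out of the error text and intersects them. The helper name is hypothetical and the parsing only assumes the bracketed, space-separated format shown in the log above.

```python
import re


def challenge_sets(error: str) -> dict:
    """Collect the bracketed challenge lists from a 'no solvers available' error."""
    sets = {}
    pattern = r"(configured|offered|remaining)=\[([^\]]*)\]"
    for name, body in re.findall(pattern, error):
        sets[name] = set(body.split())
    return sets


# Relevant fragment of the error from the log above.
error = (
    "no solvers available for remaining challenges "
    "(configured=[tls-alpn-01] offered=[http-01 dns-01] "
    "remaining=[http-01 dns-01])"
)

sets = challenge_sets(error)
usable = sets["configured"] & sets["offered"]
print("usable challenge types:", sorted(usable))
```

The intersection is empty: for this order the CA offered only http-01 and dns-01, while the only solver configured on the Pomerium side was tls-alpn-01. That might mean the failure is about which challenge types were offered rather than about whether Pomerium was listening yet, but I can't confirm that from the log alone.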
It tried again later but seems to have timed out:
2025-06-18T05:19:09.009477+00:00 pomserver pomerium[772354]: {"level":"error","time":"2025-06-18T05:19:09Z","logger":"obtain","msg":"could not get certificate from issuer","service":"autocert","identifier":"exampleroute2.example.com","issuer":"acme.zerossl.com-v2-DV90","error":"[exampleroute2.example.com] creating new order: fetching new nonce from server: context deadline exceeded (ca=https://acme.zerossl.com/v2/DV90)"}
After a while it looks like it gets confused and tries to use the Let’s Encrypt staging URL instead of ZeroSSL. This part isn’t new in 0.29: I used to see it in 0.25.2 from time to time, and after a restart it starts talking to ZeroSSL again.
2025-06-18T11:16:04.419398+00:00 pomserver pomerium[772354]: {"level":"error","time":"2025-06-18T11:16:04Z","logger":"renew","msg":"will retry","service":"autocert","error":"[exampleroute3.example.com] Renew: provisioning client: performing request: Get \"https://acme-staging-v02.api.letsencrypt.org/directory\": Forbidden","attempt":20,"retrying_in":3600,"elapsed":25234.482169559,"max_duration":2592000}
What’s your environment like?
pomerium version pomerium: 0.29.4+2c9dcfc2
envoy: 1.32.3+1aa0efe687b20e1c823962b90ba3e9a6d839b37ef3a60e080519070399b9752e
- Ubuntu 24.04 in Azure
What’s your config.yaml?
address: :443
authenticate_service_url: https://example.com
autocert: true
autocert_ca: https://acme.zerossl.com/v2/DV90
autocert_eab_key_id: ...
autocert_eab_mac_key: ...
shared_secret: ...
cookie_secret: ...
idp_provider: "azure"
...
routes:
  - from: https://exampleroute.example.com
    to: https://exampleroute.real.host:8443
    preserve_host_header: true
    set_request_headers:
      X-Forwarded-Port: 443
    policy:
      - allow:
          or:
            - claim/groups: "xxx1"
Additional context
The setup had been working for a few years. I was trying to remember why we went with ZeroSSL: I think because of the Let’s Encrypt rate limit, and ZeroSSL had also recently been made the default for acme.sh.
I could try switching to Let’s Encrypt, but I can’t tell from the logs if the problem is with Pomerium or maybe just ZeroSSL’s API being slow. The downtime from restarting and replacing all certs would be fairly long, so I wanted to check in here before trying it blindly.
Pomerium uses a proxy to connect out to the Internet, and this seems to be working, or had been working, anyway. I don’t see any TCP connections stuck in SYN_SENT, which would indicate it’s trying to go direct when it’s not supposed to.
The proxy is configured in /etc/systemd/system/pomerium.service.d/proxy.conf:
[Service]
EnvironmentFile=-/etc/proxy.conf
/etc/proxy.conf:
http_proxy="http://proxy.example.com:3128"
https_proxy="http://proxy.example.com:3128"
ftp_proxy="http://proxy.example.com:3128"
no_proxy="localhost,.local,.example.com,169.254.169.254,management.azure.com,..."
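As a sanity check on which hosts should go through the proxy, here is a rough Python sketch of common `no_proxy` suffix matching. It is an approximation I wrote for illustration, not the matching logic Pomerium or Go actually use (no CIDR or port handling), and it can only see the entries shown above because the real list is truncated with `...`.

```python
def bypasses_proxy(host: str, no_proxy: str) -> bool:
    """Approximate no_proxy matching: exact host or domain-suffix match."""
    host = host.lower().rstrip(".")
    for entry in no_proxy.split(","):
        entry = entry.strip().lower()
        if entry == "*":
            return True  # wildcard disables the proxy for every host
        suffix = entry.lstrip(".")
        if not suffix:
            continue  # skips empty entries and the elided "..." placeholder
        if host == suffix or host.endswith("." + suffix):
            return True
    return False


# Only the entries visible in /etc/proxy.conf above; the rest is elided.
no_proxy = "localhost,.local,.example.com,169.254.169.254,management.azure.com,..."

for host in (
    "acme.zerossl.com",
    "acme-staging-v02.api.letsencrypt.org",
    "exampleroute.example.com",
):
    route = "direct" if bypasses_proxy(host, no_proxy) else "via proxy"
    print(f"{host}: {route}")
```

With the visible entries, both ACME endpoints would be sent via the proxy, so the `Forbidden` on the Let’s Encrypt staging directory could be coming from the proxy rather than from the CA. That is a guess on my part; the proxy's access log would be the place to confirm it.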