Self-hosted authenticate service

What happened?

We used the hosted authenticate service successfully, but later decided to host our own in order to have an uptime guarantee.

We currently use version v0.22.1.

We now get a 401 from our local Pomerium proxy after a few minutes, and I can see the following in the logs:

{"level":"info","config_file_source":"/etc/pomerium/config/config.yaml","bootstrap":true,"service":"identity_manager","user_id":"107203644295945440985","session_id":"94ecd878-53ad-49c8-b0bc-b54500772a97","time":"2024-04-03T12:27:32Z","message":"refreshing session"}
{"level":"info","config_file_source":"/etc/pomerium/config/config.yaml","bootstrap":true,"service":"identity_manager","user_id":"107203644295945440985","session_id":"94ecd878-53ad-49c8-b0bc-b54500772a97","time":"2024-04-03T12:27:32Z","message":"no authenticator defined, deleting session"}

After that, our session is cleared and we are redirected automatically to our authenticate service (our frontend logic does that for us). The authenticate service still has the cookie, so we are redirected back to our local Pomerium proxy and the session is recreated without us having to actually log in again. This is still a problem because of the redirects.

I realized we might have to set a shared_secret between our proxy and the authenticate service, so I did, although there was nothing related to it in the logs.

Now it “seems” to work better: at least I don’t get the session deletion and the 401 for a while initially, but then it starts happening again after an hour or so. From what I can tell, it is not entirely deterministic how and when that happens.

What did you expect to happen?

I expected the authenticate service and our proxy to behave the same way as they did previously.

How’d it happen?

I tried to describe it above.

What’s your environment like?

  • Pomerium version (retrieve with pomerium --version): v0.22.1
  • Server Operating System/Architecture/Cloud: kubernetes (minikube/gke)

What’s your config.yaml?

This is our local Pomerium deployment config, without the routes:

    address: :8000
    metrics_address: :8010
    autocert: false
    log_level: "info"
    shared_secret: X
    authenticate_service_url: "https://authenticate.our.domain"
    certificates:
      - cert: /etc/pomerium/certificates/combined/tls.crt
        key: /etc/pomerium/certificates/combined/tls.key
    routes:
      ...

Our full authenticate service config:

    address: :8000
    metrics_address: :8010
    autocert: false
    log_level: "info"
    authenticate_service_url: "https://authenticate.our.domain"
    idp_provider: google
    shared_secret: X
    certificates:
      - cert: /etc/pomerium/certificates/combined/tls.crt
        key: /etc/pomerium/certificates/combined/tls.key

What did you see in the logs?

{"level":"info","type":"type.googleapis.com/session.Session","id":"94ecd878-53ad-49c8-b0bc-b54500772a97","time":"2024-04-03T12:28:12Z","message":"get"}
{"level":"info","type":"type.googleapis.com/session.Session","query":"","offset":0,"limit":1,"filter":{"$or":[{"id":"94ecd878-53ad-49c8-b0bc-b54500772a97"},{"$index":"94ecd878-53ad-49c8-b0bc-b54500772a97"}]},"time":"2024-04-03T12:28:17Z","message":"query"}
{"level":"warn","error":"record not found","time":"2024-04-03T12:28:17Z","message":"clearing session due to missing session or service account"}
{"level":"warn","error":"record not found","time":"2024-04-03T12:28:17Z","message":"clearing session due to missing session or service account"}
{"level":"error","error":"Unauthorized","status":401,"status-text":"Unauthorized","request-id":"9f52cfab-bb0c-4ce5-ba27-9634d7e2c68d","time":"2024-04-03T12:28:17Z","message":"httputil: error"}
{"level":"error","error":"Unauthorized","status":401,"status-text":"Unauthorized","request-id":"9db7abcf-fb53-4dc9-a024-700bdd9bce84","time":"2024-04-03T12:28:17Z","message":"httputil: error"}
{"level":"warn","error":"record not found","time":"2024-04-03T12:28:17Z","message":"clearing session due to missing session or service account"}
{"level":"error","error":"hpke: error requesting hpke-public-key endpoint: Get \"https://authenticate.our.domain/.well-known/pomerium/hpke-public-key\": context canceled","request-id":"c804a771-e9b8-401d-aede-ca94959d1779","time":"2024-04-03T12:28:17Z","message":"grpc check ext_authz_error"}
{"level":"warn","error":"record not found","time":"2024-04-03T12:28:17Z","message":"clearing session due to missing session or service account"}
{"level":"warn","error":"record not found","time":"2024-04-03T12:28:17Z","message":"clearing session due to missing session or service account"}
{"level":"info","type":"type.googleapis.com/session.Session","id":"f5d86e4d-747e-4d14-8211-a7acae3d3bcb","time":"2024-04-03T12:28:17Z","message":"get"}
{"level":"info","type":"type.googleapis.com/user.User","id":"107203644295945440985","time":"2024-04-03T12:28:17Z","message":"get"}
{"level":"info","record-count":2,"record-type":"type.googleapis.com/user.User","time":"2024-04-03T12:28:17Z","message":"put"}
{"level":"info","type":"type.googleapis.com/session.Session","query":"","offset":0,"limit":1,"filter":{"$or":[{"id":"f5d86e4d-747e-4d14-8211-a7acae3d3bcb"},{"$index":"f5d86e4d-747e-4d14-8211-a7acae3d3bcb"}]},"time":"2024-04-03T12:28:17Z","message":"query"}
{"level":"info","type":"type.googleapis.com/session.Session","query":"","offset":0,"limit":1,"filter":{"$or":[{"id":"f5d86e4d-747e-4d14-8211-a7acae3d3bcb"},{"$index":"f5d86e4d-747e-4d14-8211-a7acae3d3bcb"}]},"time":"2024-04-03T12:28:17Z","message":"query"}
{"level":"info","type":"type.googleapis.com/user.ServiceAccount","query":"","offset":0,"limit":1,"filter":{"$or":[{"id":"f5d86e4d-747e-4d14-8211-a7acae3d3bcb"},{"$index":"f5d86e4d-747e-4d14-8211-a7acae3d3bcb"}]},"time":"2024-04-03T12:28:17Z","message":"query"}
{"level":"info","type":"type.googleapis.com/session.Session","query":"","offset":0,"limit":1,"filter":{"$or":[{"id":"f5d86e4d-747e-4d14-8211-a7acae3d3bcb"},{"$index":"f5d86e4d-747e-4d14-8211-a7acae3d3bcb"}]},"time":"2024-04-03T12:28:17Z","message":"query"}
{"level":"info","type":"type.googleapis.com/session.Session","query":"","offset":0,"limit":1,"filter":{"$or":[{"id":"f5d86e4d-747e-4d14-8211-a7acae3d3bcb"},{"$index":"f5d86e4d-747e-4d14-8211-a7acae3d3bcb"}]},"time":"2024-04-03T12:28:17Z","message":"query"}
{"level":"info","type":"type.googleapis.com/session.Session","query":"","offset":0,"limit":1,"filter":{"$or":[{"id":"f5d86e4d-747e-4d14-8211-a7acae3d3bcb"},{"$index":"f5d86e4d-747e-4d14-8211-a7acae3d3bcb"}]},"time":"2024-04-03T12:28:17Z","message":"query"}
{"level":"info","type":"type.googleapis.com/session.Session","query":"","offset":0,"limit":1,"filter":{"$or":[{"id":"f5d86e4d-747e-4d14-8211-a7acae3d3bcb"},{"$index":"f5d86e4d-747e-4d14-8211-a7acae3d3bcb"}]},"time":"2024-04-03T12:28:17Z","message":"query"}
{"level":"info","type":"type.googleapis.com/session.Session","query":"","offset":0,"limit":1,"filter":{"$or":[{"id":"f5d86e4d-747e-4d14-8211-a7acae3d3bcb"},{"$index":"f5d86e4d-747e-4d14-8211-a7acae3d3bcb"}]},"time":"2024-04-03T12:28:17Z","message":"query"}
{"level":"info","type":"type.googleapis.com/session.Session","query":"","offset":0,"limit":1,"filter":{"$or":[{"id":"f5d86e4d-747e-4d14-8211-a7acae3d3bcb"},{"$index":"f5d86e4d-747e-4d14-8211-a7acae3d3bcb"}]},"time":"2024-04-03T12:28:17Z","message":"query"}
{"level":"info","type":"type.googleapis.com/session.Session","query":"","offset":0,"limit":1,"filter":{"$or":[{"id":"f5d86e4d-747e-4d14-8211-a7acae3d3bcb"},{"$index":"f5d86e4d-747e-4d14-8211-a7acae3d3bcb"}]},"time":"2024-04-03T12:28:17Z","message":"query"}
{"level":"info","type":"type.googleapis.com/session.Session","query":"","offset":0,"limit":1,"filter":{"$or":[{"id":"f5d86e4d-747e-4d14-8211-a7acae3d3bcb"},{"$index":"f5d86e4d-747e-4d14-8211-a7acae3d3bcb"}]},"time":"2024-04-03T12:28:17Z","message":"query"}
{"level":"info","type":"type.googleapis.com/session.Session","query":"","offset":0,"limit":1,"filter":{"$or":[{"id":"f5d86e4d-747e-4d14-8211-a7acae3d3bcb"},{"$index":"f5d86e4d-747e-4d14-8211-a7acae3d3bcb"}]},"time":"2024-04-03T12:28:17Z","message":"query"}
{"level":"info","service":"envoy","upstream-cluster":"","method":"GET","authority":"ui.axoflow.garden","path":"/api/v1/host-metrics","user-agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.0.0 Safari/537.36","referer":"https://ui.axoflow.garden/","forwarded-for":"10.244.0.4","request-id":"9f52cfab-bb0c-4ce5-ba27-9634d7e2c68d","duration":20.198708,"size":465,"response-code":401,"response-code-details":"ext_authz_denied","time":"2024-04-03T12:28:17Z","message":"http-request"}
{"level":"info","service":"envoy","upstream-cluster":"","method":"GET","authority":"ui.axoflow.garden","path":"/api/v1/hosts","user-agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.0.0 Safari/537.36","referer":"https://ui.axoflow.garden/","forwarded-for":"10.244.0.4","request-id":"9db7abcf-fb53-4dc9-a024-700bdd9bce84","duration":22.092334,"size":465,"response-code":401,"response-code-details":"ext_authz_denied","time":"2024-04-03T12:28:17Z","message":"http-request"}
{"level":"info","service":"envoy","upstream-cluster":"","method":"GET","authority":"ui.axoflow.garden","path":"/signin","user-agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.0.0 Safari/537.36","referer":"https://ui.axoflow.garden/","forwarded-for":"10.244.0.4","request-id":"c804a771-e9b8-401d-aede-ca94959d1779","duration":0,"size":0,"response-code":0,"response-code-details":"http2.remote_reset","time":"2024-04-03T12:28:17Z","message":"http-request"}

Additional context

This happens in the identity manager, which is run by the databroker service. Without IDP credentials the identity manager will not be able to refresh user sessions. When session refresh fails the identity manager deletes the session and then subsequent authorization requests initiate a redirect back to the authenticate service.

The databroker needs IDP credentials to properly refresh user sessions.
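
As a rough sketch of what that could look like, assuming the databroker runs as part of the local deployment shown above: the IdP settings would be added to that deployment's config.yaml. The `idp_client_id` and `idp_client_secret` values below are placeholders, and it is an assumption here that they must belong to the same Google OAuth client that the authenticate service uses. Whether this alone resolves the session deletion in this setup is not confirmed in this thread.

```yaml
# Local deployment config (the one that runs the databroker / identity manager).
# Only the idp_* lines are new; everything else is copied from the report above.
address: :8000
metrics_address: :8010
autocert: false
log_level: "info"
shared_secret: X
authenticate_service_url: "https://authenticate.our.domain"

# Assumption: same provider and OAuth client as the authenticate service,
# so that the identity manager can use the stored refresh token.
idp_provider: google
idp_client_id: REPLACE_WITH_CLIENT_ID          # placeholder
idp_client_secret: REPLACE_WITH_CLIENT_SECRET  # placeholder

certificates:
  - cert: /etc/pomerium/certificates/combined/tls.crt
    key: /etc/pomerium/certificates/combined/tls.key
```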

This is also an issue in hosted authenticate, but we purposefully set the session expiration to be high in AWS Cognito to avoid needing to refresh those sessions. The defaults for Google are likely much lower, leading to users getting logged out sooner.

Thanks, it makes sense! However, I haven’t seen where that expiration is set exactly. Also, I don’t completely understand why the authenticate service doesn’t have to log me in again. Is that controlled by a different expiration between it and the IdP?

Hi @pepov, there are a few different expiration times to be aware of.

Upon successful sign-in to a Pomerium route, Pomerium receives an access token and refresh token from the IdP. These tokens are stored as part of the Pomerium session (in the databroker storage backend). Both of these have some expiration time, set by the IdP. If you’re using Google as your identity provider, I believe the access tokens will be valid for 1 hour. (I’m not sure whether this is customizable.) The expiration time for a refresh token will generally be much longer (many days).

There is also the Pomerium cookie_expiration setting, which controls the maximum lifetime of a Pomerium session.
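
For illustration only, a minimal config fragment for that setting. To my knowledge the config-file key is `cookie_expire` (a Go-style duration string, defaulting to 14h), but please check the settings reference for your Pomerium version. The `8h` value is an arbitrary example, not a recommendation.

```yaml
# Maximum lifetime of a Pomerium session.
# Assumption: the key is cookie_expire; verify against your version's reference.
cookie_expire: 8h
```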

If this maximum Pomerium session lifetime is longer than the access token validity, then Pomerium will attempt to refresh the access token with your IdP, using the corresponding refresh token. Currently this is scheduled to happen when an access token is within 1 minute of expiring. This way, Pomerium will find out if the underlying IdP session has been revoked.

I hope that’s helpful. Please let us know if you have any other questions.

And as a side comment, we generally encourage running Pomerium in all-in-one mode, rather than in split-service mode. I think you may find it simpler to configure Pomerium if you have one single configuration file.

Thanks,
Ken

Hey, thanks for the detailed answer!

This all makes sense now, and I can verify that this is the case. The problem (at least in our case) comes after that one hour, when the Google access token expires. At that point we get redirected to the authenticate service, which doesn’t reauthenticate us. Eventually we get redirected back to our proxy and receive the same access token, which has already expired. With 0.22.1 this works for a minute or two, but then the session refresh happens again in the background and we get redirected. This then goes on forever.

I didn’t have success upgrading to the latest version, because we get into a redirect loop there instantly and I didn’t have time to debug it properly.

Please note that we have decided to move away from this approach. We will install Pomerium all-in-one and connect the IdPs directly.

OK, thanks for letting us know!