PostgreSQL RDS Instance Supported?

What happened?

When trying to use a PostgreSQL database managed by Amazon RDS (Aurora or standard Postgres) as the databroker database, Pomerium is unable to connect to it.

What did you expect to happen?

Expected databroker to connect to the database so that requests could be proxied.

How’d it happen?

  1. Deploy an AWS RDS PostgreSQL instance (any type)
  2. Deploy an AWS Fargate task that includes a pomerium container

What’s your environment like?

  • Pomerium version (retrieve with pomerium --version): docker container pomerium/pomerium:latest
  • Server Operating System/Architecture/Cloud: AWS Fargate, AWS RDS (Aurora or PostgreSQL)

What’s your config.yaml?

Technically none, but here are the environment variables:

CERTIFICATE=<REDACTED>
CERTIFICATE_KEY=<REDACTED>
COOKIE_DOMAIN=pomerium.io 
COOKIE_SECRET=<REDACTED> 
DATABROKER_STORAGE_CONNECTION_STRING=postgresql://<REDACTED>@<REDACTED>.<REGION>.rds.amazonaws.com:5432/pomerium?target_session_attrs=read-write&sslmode=disable
DATABROKER_STORAGE_TLS_SKIP_VERIFY=true 
DATABROKER_STORAGE_TYPE=postgres 
EXPIRATION=24h
HTTP_REDIRECT_ADDR=:80 
IDP_CLIENT_ID=<REDACTED> 
IDP_CLIENT_SECRET=<REDACTED> 
IDP_PROVIDER=oidc
IDP_PROVIDER_URL=<REDACTED> 
IDP_SCOPES=<REDACTED>
METRICS_ADDRESS=:8080
ROUTES=<REDACTED>
SHARED_SECRET=<REDACTED>
SIGNING_KEY=<REDACTED>

What did you see in the logs?

{
    "level": "error",
    "syncer_id": "databroker",
    "syncer_type": "type.googleapis.com/pomerium.config.Config",
    "error": "error during initial sync: error receiving record: rpc error: code = DeadlineExceeded desc = failed to connect to `host=<REDACTED>.rds.amazonaws.com user=<REDACTED> database=pomerium`: dial error (timeout: dial tcp 10.0.11.100:5432: i/o timeout)",
    "time": "2022-11-21T12:53:04Z",
    "message": "sync"
}
{
    "level": "info",
    "syncer_id": "databroker",
    "syncer_type": "type.googleapis.com/pomerium.config.Config",
    "time": "2022-11-21T12:54:33Z",
    "message": "initial sync"
}
{
    "level": "info",
    "type": "type.googleapis.com/pomerium.config.Config",
    "time": "2022-11-21T12:54:33Z",
    "message": "sync latest"
}
{
    "level": "error",
    "config_file_source": "/pomerium/config.yaml",
    "bootstrap": true,
    "error": "rpc error: code = DeadlineExceeded desc = failed to connect to `host=<REDACTED>.rds.amazonaws.com user=<REDACTED> database=pomerium`: dial error (timeout: dial tcp 10.0.11.100:5432: i/o timeout)",
    "time": "2022-11-21T12:55:04Z",
    "message": "controlplane: error storing configuration event, retrying"
}
{
    "level": "warn",
    "config_file_source": "/pomerium/config.yaml",
    "bootstrap": true,
    "error": "rpc error: code = DeadlineExceeded desc = failed to connect to `host=<REDACTED>.rds.amazonaws.com user=<REDACTED> database=pomerium`: dial error (timeout: dial tcp 10.0.11.100:5432: i/o timeout)",
    "lease_name": "identity_manager",
    "time": "2022-11-21T12:57:04Z",
    "message": "leaser: error acquiring lease"
}
{
    "level": "info",
    "name": "identity_manager",
    "duration": 30000,
    "time": "2022-11-21T12:58:14Z",
    "message": "acquire lease"
}
{
    "level": "error",
    "error": "failed to connect to `host=<REDACTED>.rds.amazonaws.com user=<REDACTED> database=pomerium`: dial error (timeout: dial tcp 10.0.11.100:5432: i/o timeout)",
    "time": "2022-11-21T12:59:04Z",
    "message": "storage/postgres"
}
{
    "level": "error",
    "error": "failed to connect to `host=<REDACTED>.rds.amazonaws.com user=<REDACTED> database=pomerium`: dial error (timeout: dial tcp 10.0.11.100:5432: i/o timeout)",
    "time": "2022-11-21T13:01:04Z",
    "message": "storage/postgres"
}

Additional context

pgAdmin can connect successfully from another Fargate task with otherwise identical permissions (Security Groups, Subnets, Task and Execution Roles, etc.), so there is no reachability issue regarding the database related to subnet or security group settings in AWS.
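
To confirm reachability from inside the failing task itself rather than from a sibling task, a plain TCP check can be run in the same container or network namespace. This is only a minimal sketch: the helper name can_reach is hypothetical, and it tests nothing beyond whether a TCP handshake to the given host and port completes within the timeout.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS resolution failures.
        return False
```

Calling can_reach with the RDS endpoint host name and port 5432 from the pomerium task would show whether the i/o timeout in the logs reproduces outside of Pomerium.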

Additionally, the enterprise console (which sits behind the pomerium proxy) can connect to the database from the same task configuration, i.e. security permissions for both containers are identical. That suggests a connection string issue, but when I replicate the connection string that the console uses, I receive the following error:

{
    "level": "error",
    "error": "cannot parse `pgsql://<REDACTED>@<REDACTED>.<REGION>.rds.amazonaws.com:5432/pomerium`: failed to parse as DSN (invalid dsn)",
    "time": "2022-12-08T13:51:10Z",
    "message": "storage/postgres"
}

The URI scheme designator can be either postgresql:// or postgres://.

Using either of those schemes triggers the previous dial tcp error.
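
A connection string can be sanity-checked before it is handed to Pomerium. The sketch below is an illustration only: the helper name inspect_dsn is hypothetical, the accepted scheme list is taken from the statement above rather than from the driver source, and the host in the usage note is a placeholder.

```python
from urllib.parse import parse_qs, urlsplit

# Schemes taken from the statement above; treated here as an assumption.
ACCEPTED_SCHEMES = {"postgres", "postgresql"}


def inspect_dsn(dsn: str) -> dict:
    """Split a URI-style connection string into the parts worth double-checking."""
    parts = urlsplit(dsn)
    return {
        "scheme_ok": parts.scheme in ACCEPTED_SCHEMES,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
        # Keep the first value of each query parameter, e.g. sslmode.
        "params": {key: values[0] for key, values in parse_qs(parts.query).items()},
    }
```

For example, inspect_dsn("pgsql://user@db.example.com:5432/pomerium") reports scheme_ok as False, which matches the invalid dsn error above, while the same string with postgresql:// reports True. A valid scheme only rules out the parse error; it says nothing about the dial timeout.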

It looks like Fargate security settings are per-task. If Console can connect to the RDS instance but Core cannot, maybe something is off wrt security settings?

The pomerium proxy container that can’t connect and the console container are in the same task.