ZITADEL · 17mo ago
23 replies
TomasP

K8s deployment DB issues

Today we noticed error logs in our k8s deployment of Zitadel that relate to DB access, and we can't figure out what's wrong. We have two separate k8s deployments: one with Postgres deployed in-cluster and another with a managed Postgres solution. Both deployments produce the same errors. This is a sample of the errors:
time="2024-09-05T04:48:51Z" 
    level=info 
    msg="process events failed" 
    caller="/home/runner/work/zitadel/zitadel/internal/eventstore/handler/v2/handler.go:413" 
    error="context deadline exceeded" 
    projection=projections.project_grant_members4

time="2024-09-05T04:48:53Z" 
    level=info 
    msg="process events failed" 
    caller="/home/runner/work/zitadel/zitadel/internal/eventstore/handler/v2/handler.go:413" 
    error="failed to connect to `host=auth-domain.postgres-system.svc.cluster.local user=auth_user database=auth_zitadel_prod`: 
        hostname resolving error (lookup auth-domain.postgres-system.svc.cluster.local on 172.16.0.3:53: dial udp 172.16.0.3:53: i/o timeout)" 
    projection=projections.secret_generators2

time="2024-09-05T04:48:53Z" 
    level=info 
    msg="process events failed" 
    caller="/home/runner/work/zitadel/zitadel/internal/eventstore/handler/v2/handler.go:413" 
    error="failed to connect to `host=auth-domain.postgres-system.svc.cluster.local user=auth_user database=auth_zitadel_prod`: 
        hostname resolving error (lookup auth-domain.postgres-system.svc.cluster.local on 172.16.0.3:53: dial udp 172.16.0.3:53: i/o timeout)" 
    projection=projections.project_grant_members4


We tried a Python script that opens many pg connections, but it didn't reproduce any connection issues, and Postgres is barely utilized. Our application stack, which runs on the same k8s cluster as Zitadel, uses Postgres notifications directly (pg_notify) to track the Zitadel event store, but I assume this should not affect anything. The local Docker stack doesn't have these issues.
Any suggestions on where we should look further?
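Two of the three sampled errors are hostname lookups timing out against 172.16.0.3:53 rather than errors returned by Postgres, so a connection test that resolves the name once may not exercise the failing step. Below is a minimal sketch of a resolver probe that could be run from a pod in the same cluster. The function name `probe_dns`, the attempt count, and the one-second "slow" threshold are illustrative assumptions, not anything from Zitadel. It also uses Python's `socket.getaddrinfo`, which may not behave exactly like the resolver Zitadel uses.

```python
import socket
import time


def probe_dns(host, attempts=20, slow_after=1.0, resolve=socket.getaddrinfo):
    """Resolve `host` repeatedly and count successes, slow lookups, and failures.

    `resolve` is injectable so the function can be exercised without a network;
    by default it is the standard library's socket.getaddrinfo.
    """
    results = {"ok": 0, "slow": 0, "failed": 0, "errors": []}
    for _ in range(attempts):
        start = time.monotonic()
        try:
            # 5432 is the default Postgres port; only the lookup matters here.
            resolve(host, 5432)
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            results["failed"] += 1
            results["errors"].append(str(exc))
            continue
        elapsed = time.monotonic() - start
        if elapsed > slow_after:  # assumed threshold for a "slow" lookup
            results["slow"] += 1
        else:
            results["ok"] += 1
    return results
```

Calling `probe_dns("auth-domain.postgres-system.svc.cluster.local")` with the hostname from the logs returns the counters as a dict. A non-zero `failed` or `slow` count would point toward cluster DNS rather than the database, while a clean run would not rule DNS out, since the timeouts in the logs appear intermittent.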