rud, 19h ago

Staggering log volume when database is unavailable

We had maintenance on a database the other day, and Zitadel was very enthusiastic about reporting errors, either DNS lookup failures or connection errors. I reproduced the behaviour with a 4.1.3 install running in Docker Compose by stopping the db instance, so that DNS resolution would fail. These logs are from the Zitadel container:
time=2025-09-11T09:33:17.066Z level=ERROR msg="producer: Error fetching queue settings" err="failed to connect to `user=zitadel_user database=zitadel`: hostname resolving error: lookup db on 127.0.0.11:53: no such host"

time=2025-09-11T09:33:17.539Z level=ERROR msg="producer: Error fetching jobs" err="failed to connect to `user=zitadel_user database=zitadel`: hostname resolving error: lookup db on 127.0.0.11:53: no such host" queue=execution

time=2025-09-11T09:33:17.539Z level=ERROR msg="producer: Error fetching jobs" err="failed to connect to `user=zitadel_user database=zitadel`: hostname resolving error: lookup db on 127.0.0.11:53: no such host"
We're seeing about 1 million log lines per minute, which I think is a risk to stability. Would it be possible to add some throttling to the connection retry mechanism? It seems a bit too eager, at least to me.
2 Replies
Ask Inkeep, 19h ago
Hi @rud, I'm an AI assistant for ZITADEL. While a team member reviews this question, I'll see if I can find relevant sources for your question.

Answer
I wasn't able to find a direct answer, here are some relevant sources:
1. failed to connect to db: This discussion shows similar database connection issues during ZITADEL setup and startup, though it doesn't address throttling mechanisms.
2. error logs: This thread discusses database connectivity issues and 504 errors, but doesn't cover retry throttling behavior.

The available sources don't contain information about configuring throttling for database connection retries. For this specific feature request about reducing the aggressive retry behavior and log volume, you may want to click the "Ask for help" button to tag a team member who can provide guidance on existing configuration options or consider this as a feature enhancement.

Feel free to tag @Ask Inkeep with additional questions.

Got it. Tagging @Rajat for additional help.
rud (OP), 19h ago
Hi again, @Rajat, hope this Thursday finds you well. I wanted to post this here first, but I think it might be worth filing as a bug report against the code. What do you think?
