Ozzzkar
Ozzzkar4w ago

External IdP logins fail with multiple Zitadel replicas

Using Zitadel 4.4.0 deployed from Zitadel Helm chart 9.8.0, external IdP logins fail with "an internal error occurred" when we run multiple replicas of the Zitadel deployment (2 pods). With only 1 pod, it works. When the failure happens, we can't find any errors in the logs. I checked stdout (which we configured for JSON format) as well as the Events and Failed Events tabs of the default organization in Zitadel's admin console. Does anyone know how to debug this?
13 Replies
Ozzzkar
OzzzkarOP4w ago
Now I finally found some logs: they are in zitadel-login, not zitadel. In one environment I get "Error [ConnectError]: [not_found] Auth Request does not exist (QUERY-Thee9)", and in the other I get "Intent has not succeeded (IDP-nme4gszsvx)".
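When the same failure shows up under different messages in different environments, tallying the parenthesized error IDs across the zitadel-login pod logs can make the pattern easier to see. The following is only a rough sketch: `tally_error_ids` is a hypothetical helper, and the ID pattern (uppercase prefix, hyphen, alphanumeric suffix) is inferred solely from the two messages quoted above, so it may miss IDs of a different shape.

```python
import re
from collections import Counter

# Assumed shape of the trailing error ID, e.g. "(QUERY-Thee9)".
# Inferred from the two messages in this thread, not from Zitadel docs.
ERROR_ID = re.compile(r"\(([A-Z]+-[A-Za-z0-9]+)\)")


def tally_error_ids(lines):
    """Count how often each parenthesized error ID appears in the log lines."""
    counts = Counter()
    for line in lines:
        counts.update(ERROR_ID.findall(line))
    return counts


# The two log messages quoted in this thread, used as sample input.
sample = [
    "Error [ConnectError]: [not_found] Auth Request does not exist (QUERY-Thee9)",
    "Intent has not succeeded (IDP-nme4gszsvx)",
]

for error_id, count in tally_error_ids(sample).most_common():
    print(error_id, count)
```

In practice the input would be the collected log lines of every zitadel-login pod rather than the two-line sample.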
Ozzzkar
OzzzkarOP4w ago
When I get "Auth Request does not exist", the whole zitadel-login front page returns error 500. "Intent has not succeeded" occurs in the IdP auth flow, and that's the one that shows the yellow error message "An internal error occurred" on the page.

I can also see "Error [ConnectError]: [not_found] Session does not exist (QUERY-SFeaa)".

Probably a race condition: zitadel-login creates a session in zitadel, then asks the other zitadel replica for that session, but it hasn't synced to the DB yet, so the replica returns "not found"?
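The suspected read-after-write race can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not Zitadel's actual code or the eventual fix: `ToyReplica`, `NotFound`, and `get_with_retry` are hypothetical names, and the "lag" is simulated by making a replica miss its first few reads of a session.

```python
import time


class NotFound(Exception):
    """Stands in for the [not_found] error seen in the zitadel-login logs."""


class ToyReplica:
    """Hypothetical replica whose read view lags behind the shared store.

    The first `lag_reads` lookups of a session fail even though the
    session has already been written, mimicking a stale read.
    """

    def __init__(self, store, lag_reads):
        self.store = store
        self.lag_reads = lag_reads
        self.reads = {}

    def get_session(self, session_id):
        seen = self.reads.get(session_id, 0)
        self.reads[session_id] = seen + 1
        if session_id not in self.store or seen < self.lag_reads:
            raise NotFound(f"Session does not exist: {session_id}")
        return self.store[session_id]


def get_with_retry(replica, session_id, attempts=5, base_delay=0.01):
    """Retry a lookup with exponential backoff while it returns NotFound."""
    for attempt in range(attempts):
        try:
            return replica.get_session(session_id)
        except NotFound:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)


store = {"s1": {"user": "alice"}}            # session already written
lagging = ToyReplica(store, lag_reads=2)     # second replica is behind

try:
    lagging.get_session("s1")                # immediate read: stale
except NotFound as err:
    print("first read failed:", err)

print("after retry:", get_with_retry(lagging, "s1"))
```

With a single replica there is no lagging reader, which would be consistent with the 1-pod setup working. A client-side retry like this only hides the symptom; it is shown here to make the hypothesis concrete, not as a recommended workaround.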
Arnau
Arnau3w ago
Hello, I just hit the same issue with 2 instances. I have a K8s namespace with Zitadel v4.4.0 and Helm release 9.12.1, the latest versions as of writing this. I had configured a single replica, and the login app was working nicely, in our case with SAML IdPs. Then I enabled 2 replicas and started seeing the same issue as @Ozzzkar.
Ozzzkar
OzzzkarOP3w ago
I'm filing a GitHub issue right now.
Ozzzkar
OzzzkarOP3w ago
GitHub
[Bug]: Race conditions when running multiple Zitadel replicas · Is...
Preflight Checklist I could not find a solution in the documentation, the existing issues or discussions I have joined the ZITADEL chat Environment Self-hosted Version 4.4.0 Database PostgreSQL Dat...
Arnau
Arnau3w ago
CC @Rajat @Federico @elina_shoko
elina_shoko
elina_shoko3w ago
Heya, thanks for reporting, we're on it 🙏
Arnau
Arnau3w ago
Hi there, we appreciate the prioritization on this one. Any new insights? This race condition is blocking our Zitadel v4 upgrade path - we can't proceed even to nonprod envs without a fix for this. Happy to test RCs if available. Thanks for your understanding.
Rajat
Rajat2w ago
Hey @Arnau, thanks for the heads-up. I will check internally on the status and get back to you.

Hey @Arnau, I think it's still a work in progress; I can see it's being worked on. Only @elina_shoko can give an exact timeline.
elina_shoko
elina_shoko2w ago
heya @Ozzzkar @Arnau happy Friday, the fix is out :gigilove:
Arnau
Arnau7d ago
I'll test 4.6.2 first thing Monday 🙂 Thank you!

Tested OK with 4.6.2: I was not able to reproduce the issue when logging in and out with a SAML IdP. Thanks!
Ozzzkar
OzzzkarOP7d ago
thanks, I will also test again soon
