Intermittent dramatic degradation in event store PostgreSQL query
Use-case: Hi, we're using Zitadel for identity management. (I'm new to Zitadel, so if that's not specific enough, I apologize; I don't know the lingo yet.)
Environment: Self Hosting
Version: Not sure
Stack: PostgreSQL is I think the only relevant thing
What you expected to happen: Event store query is instantaneous
What went wrong: Event store query suddenly puts significant CPU load on database and causes latency
Wondering whether folks have seen this? At runtime, the query gets roughly 30x more expensive in CPU terms, then eventually drops back to its normal cost. This plays out on a time scale of days: bad for days, then good for days, with no change on our side that we can point to. The database is AWS RDS, and I suspect an autovacuum/autoanalyze run is flipping the query plan.
Just wanted to know if it rings bells, and if so, what folks did.
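For reference, here's roughly what I'm planning to check next. This is just a sketch: I'm assuming the Zitadel tables live in an `eventstore` schema and that `pg_stat_statements` is enabled with PostgreSQL 13+ column names, neither of which I've confirmed on our instance.

```sql
-- When did autovacuum / autoanalyze last touch the event store tables,
-- and how much dead-tuple bloat is sitting there?
-- (schema name 'eventstore' is my guess -- adjust if yours differs)
SELECT schemaname, relname,
       last_autovacuum, last_autoanalyze,
       n_live_tup, n_dead_tup
FROM   pg_stat_user_tables
WHERE  schemaname = 'eventstore';

-- Find the expensive statement, so its plan can be captured with
-- EXPLAIN (ANALYZE, BUFFERS) in both the fast and the slow periods
-- and compared. Requires the pg_stat_statements extension.
SELECT calls, mean_exec_time, total_exec_time, query
FROM   pg_stat_statements
ORDER  BY total_exec_time DESC
LIMIT  10;
```

The idea is to line up the timestamps of the CPU spikes against `last_autoanalyze` and see whether a fresh-statistics run coincides with the plan going bad.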