| From: | Rahila Syed <rahilasyed90(at)gmail(dot)com> |
|---|---|
| To: | Alexander Lakhin <exclusion(at)gmail(dot)com> |
| Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: Error while processing invalidation message during ATTACH PARTITION leaves invalid relcache entry |
| Date: | 2026-06-16 04:42:36 |
| Message-ID: | CAH2L28tLwS+P===doCOpucOAQMvmfJFvtxfTMkSRgNx4Wsdcpw@mail.gmail.com |
| Lists: | pgsql-hackers |
Hi Alexander,
Thank you for the report. This is an interesting case of incomplete or
incorrect error handling.
Regarding the code path in LocalExecuteInvalidationMessage:
> (This can seem dubious, but I guess there could be other (perhaps more
> sophisticated) ways to trigger an error somewhere inside
> LocalExecuteInvalidationMessage() -> RelationCacheInvalidateEntry() ->
> RelationFlushRelation() -> RelationRebuildRelation() ->
> RelationBuildDesc() -> RelationBuildTupleDesc() -> systable_getnext()...)
I wonder if we should disallow CHECK_FOR_INTERRUPTS() (CFI) calls
anywhere in this path. A quick search did not reveal any existing CFI
calls here. In your example, the CFI is reached only through the
elog(LOG, "") that you added to the code for testing.
To prevent incomplete cache invalidation during an abort, we probably
need to avoid processing interrupts and ensure the process does not
error out. Otherwise, as you demonstrated, we risk leaving the
relcache in an inconsistent state where a stale entry remains even
after a transaction is rolled back.
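
To make the failure mode concrete, here is a toy, self-contained model
of the idea. It is not PostgreSQL code: every name in it (ToyCacheEntry,
toy_rebuild, and so on) is hypothetical, and setjmp/longjmp merely stands
in for an ERROR being thrown. It is only loosely modelled on the
holdoff-counter approach behind HOLD_INTERRUPTS()/RESUME_INTERRUPTS(),
and whether that is the right fix here is still an open question.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>

/* Toy stand-ins for backend state; all names here are hypothetical. */
static volatile int  holdoff_count = 0;        /* > 0: interrupts are deferred */
static volatile bool interrupt_pending = false;
static jmp_buf       error_ctx;                /* models the ERROR jump target */

typedef struct ToyCacheEntry
{
    bool valid;      /* false while the entry is being rebuilt */
    int  version;    /* bumped only when a rebuild completes */
} ToyCacheEntry;

/* Models an interrupt check: throws unless interrupts are held off. */
static void
toy_check_for_interrupts(void)
{
    if (interrupt_pending && holdoff_count == 0)
    {
        interrupt_pending = false;
        longjmp(error_ctx, 1);
    }
}

static void
toy_hold_interrupts(void)
{
    holdoff_count++;
}

static void
toy_resume_interrupts(void)
{
    assert(holdoff_count > 0);
    holdoff_count--;
}

/* Rebuild an entry, with an interrupt check in the middle of the work. */
static void
toy_rebuild(ToyCacheEntry *entry)
{
    entry->valid = false;
    toy_check_for_interrupts();    /* e.g. reached indirectly via logging */
    entry->version++;
    entry->valid = true;
}

/*
 * Process one invalidation. Returns true if the rebuild ran to completion,
 * false if an error escaped and left the entry half-built.
 */
static bool
toy_process_invalidation(ToyCacheEntry *entry, bool hold)
{
    if (setjmp(error_ctx) != 0)
        return false;              /* error escaped mid-rebuild */

    if (hold)
        toy_hold_interrupts();
    toy_rebuild(entry);
    if (hold)
        toy_resume_interrupts();
    return true;
}
```

With an interrupt pending and hold = false, the rebuild is abandoned
halfway and the entry stays marked invalid. With hold = true, the rebuild
completes and the interrupt remains pending, to be serviced at a later
check outside the protected section.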
Best regards,
Rahila Syed