Recovering from detoast-related catcache invalidations

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Cc: Alexander Lakhin <exclusion(at)gmail(dot)com>
Subject: Recovering from detoast-related catcache invalidations
Date: 2023-10-26 20:43:33
Message-ID: 1393953.1698353013@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

In bug #18163 [1], Alexander proved the misgivings I had in [2]
about catcache detoasting being possibly unsafe:

>> BTW, while nosing around I found what seems like a very nasty related
>> bug. Suppose that a catalog tuple being loaded into syscache contains
>> some toasted fields. CatalogCacheCreateEntry will flatten the tuple,
>> involving fetches from toast tables that will certainly cause
>> AcceptInvalidationMessages calls. What if one of those should have
>> invalidated this tuple? We will not notice, because it's not in
>> the hashtable yet. When we do add it, we will mark it not-dead,
>> meaning that the stale entry looks fine and could persist for a long
>> while.

Attached is a POC patch for fixing this. The idea is that if we get
an invalidation while trying to detoast a catalog tuple, we should
go around and re-read the tuple a second time to get an up-to-date
version (or, possibly, discover that it no longer exists). In the
case of SearchCatCacheList, we have to drop and reload the entire
catcache list, but fortunately not a lot of new code is needed.

The detection of "get an invalidation" could be refined: what I did
here is to check for any advance of SharedInvalidMessageCounter,
which clearly will have a significant number of false positives.
However, the only way I see to make that a lot better is to
temporarily create a placeholder catcache entry (probably a negative
one) with the same keys, and then see if it got marked dead.
This seems a little expensive, plus I'm afraid that it'd be actively
wrong in the recursive-lookup cases that the existing comment in
SearchCatCacheMiss is talking about (that is, the catcache entry
might mislead any recursive lookup that happens).

Moreover, if we did do something like that then the new code
paths would be essentially untested. As the patch stands, the
reload path seems to get taken 10 to 20 times during a
"make installcheck-parallel" run of the core regression tests
(out of about 150 times that catcache detoasting is required).
Probably all of those are false-positive cases, but at least
they're exercising the logic.

So I'm inclined to leave it like this, but perhaps somebody
else will have a different opinion.

(BTW, there's a fair amount of existing catcache.c code that
will need to be indented another tab stop, but in the interests
of keeping the patch legible I didn't do that yet.)

Comments?

regards, tom lane

[1] https://www.postgresql.org/message-id/18163-859bad19a43edcf6%40postgresql.org
[2] https://www.postgresql.org/message-id/1389919.1697144487%40sss.pgh.pa.us

Attachment Content-Type Size
v1-fix-stale-catcache-entries.patch text/x-diff 7.3 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nikita Malakhov 2023-10-26 20:56:12 Re: RFC: Pluggable TOAST
Previous Message Alena Rybakina 2023-10-26 20:41:54 Re: POC, WIP: OR-clause support for indexes