Re: BUG #17268: Possible corruption in toast index after reindex index concurrently

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Alexey Ermakov <alexey(dot)ermakov(at)dataegret(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Maxim Boguk <maxim(dot)boguk(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Peter Geoghegan <pg(at)bowt(dot)ie>
Subject: Re: BUG #17268: Possible corruption in toast index after reindex index concurrently
Date: 2021-11-15 11:03:04
Message-ID: YZI+aNEnnpBASxNU@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Thu, Nov 11, 2021 at 06:09:49PM +0900, Michael Paquier wrote:
> To be clear on this point, users cannot reindex concurrently catalog
> indexes and toast indexes associated to catalog tables, just toast
> indexes of normal tables. I don't know if any of you have been
> working on a patch, but I was cooking something. It would be worth
> checking if an isolation test could be written.

So, I have worked on this one. And attached is a patch that
implements the two approaches suggested by Andres which are able to
fix the issues discussed:
1) Switch to a session-level lock on the parent relation if doing a
reindex concurrently on a toast table or on one of its indexes. This
requires to look back at pg_class.reltoastrelid to find the correct
parent. This stresses me quite a bit, and I am not sure that I like
that to be honest because we don't do anything like that in the rest
of the tree. I am also getting the feeling that this is an open door
for more issues.
2) Don't release locks when a new toast value is saved until the end
of its transaction.

After more testing, I have been able to extract and write an isolation
test that is able to reproduce the failure. It relies on a trick as
the toast relation names are not deterministic, and we cannot use
REINDEX CONCURRENTLY in a function context. So I have used an ALTER
TABLE/INDEX RENAME with a DO block to change the toast relation
names with allow_system_table_mods instead. There is no need either
for amcheck with this method.

2) is enough to fix the problem, and I'd like to think that we had
better stick with only this method for simplicity's sake.

Comments?
--
Michael

Attachment Content-Type Size
v1-0001-Fix-locking-of-toast-relations-with-REINDEX-CONC.patch text/x-diff 12.1 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2021-11-15 15:29:45 Bogus NULL object_name from pg_event_trigger_dropped_objects()
Previous Message Dave Page 2021-11-15 10:12:59 Re: Tenable Report Issue even after upgrading to correct Postgres version