|From:||Michael Paquier <michael(at)paquier(dot)xyz>|
|To:||Noah Misch <noah(at)leadboat(dot)com>|
|Cc:||Andres Freund <andres(at)anarazel(dot)de>, Alexey Ermakov <alexey(dot)ermakov(at)dataegret(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Maxim Boguk <maxim(dot)boguk(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Peter Geoghegan <pg(at)bowt(dot)ie>|
|Subject:||Re: BUG #17268: Possible corruption in toast index after reindex index concurrently|
|Views:||Raw Message | Whole Thread | Download mbox | Resend email|
On Thu, Nov 11, 2021 at 06:09:49PM +0900, Michael Paquier wrote:
> To be clear on this point, users cannot reindex concurrently catalog
> indexes and toast indexes associated to catalog tables, just toast
> indexes of normal tables. I don't know if any of you have been
> working on a patch, but I was cooking something. It would be worth
> checking if an isolation test could be written.
So, I have worked on this one. And attached is a patch that
implements the two approaches suggested by Andres which are able to
fix the issues discussed:
1) Switch to a session-level lock on the parent relation if doing a
reindex concurrently on a toast table or on one of its indexes. This
requires to look back at pg_class.reltoastrelid to find the correct
parent. This stresses me quite a bit, and I am not sure that I like
that to be honest because we don't do anything like that in the rest
of the tree. I am also getting the feeling that this is an open door
for more issues.
2) Don't release locks when a new toast value is saved until the end
of its transaction.
After more testing, I have been able to extract and write an isolation
test that is able to reproduce the failure. It relies on a trick as
the toast relation names are not deterministic, and we cannot use
REINDEX CONCURRENTLY in a function context. So I have used an ALTER
TABLE/INDEX RENAME with a DO block to change the toast relation
names with allow_system_table_mods instead. There is no need either
for amcheck with this method.
2) is enough to fix the problem, and I'd like to think that we had
better stick with only this method for simplicity's sake.
|Next Message||Tom Lane||2021-11-15 15:29:45||Bogus NULL object_name from pg_event_trigger_dropped_objects()|
|Previous Message||Dave Page||2021-11-15 10:12:59||Re: Tenable Report Issue even after upgrading to correct Postgres version|