Re: BUG #17268: Possible corruption in toast index after reindex index concurrently

From: "Bossart, Nathan" <bossartn(at)amazon(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>, Noah Misch <noah(at)leadboat(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Alexey Ermakov <alexey(dot)ermakov(at)dataegret(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, "Maxim Boguk" <maxim(dot)boguk(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Peter Geoghegan <pg(at)bowt(dot)ie>
Subject: Re: BUG #17268: Possible corruption in toast index after reindex index concurrently
Date: 2021-12-07 22:04:54
Message-ID: BC0B973C-AD3A-4163-866D-9AF8F2226174@amazon.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On 12/5/21, 5:54 PM, "Michael Paquier" <michael(at)paquier(dot)xyz> wrote:
> I have been working on this one again for the last couple of days, and
> I would still go with the simple solution, releasing the row-level
> toast locks only at the end of the transaction when saving and
> deleting a toast value. While testing, I have noticed that the
> deletion part is also important, as a REINDEX CONCURRENTLY run in
> parallel of a transaction removing a toast value would go through
> without that, rather than waiting for the transaction doing the
> deletion to commit first. I have expanded the tests with everything I
> could think about, so I'd like to commit the attached. The test is
> fancy with its use of allow_system_table_mods to make the toast
> relation names reliable, but it has been really useful.

I confirmed that the new tests reliably produce corruption and that
the suggested fix resolves it. I also lean towards the simple
solution, but I do wonder if it creates any interesting side effects.
For example, could holding the locks longer impact performance? (Of
course, performance is probably not a great reason to avoid a
sustainable solution.)

- toast_close_indexes(toastidxs, num_indexes, RowExclusiveLock);
- table_close(toastrel, RowExclusiveLock);
+ toast_close_indexes(toastidxs, num_indexes, NoLock);
+ table_close(toastrel, NoLock);

I think it would be good to expand the comments above these changes to
explain why we are keeping the lock. That might help avoid similar
problems in the future.

Nathan

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message PG Bug reporting form 2021-12-08 02:45:29 BUG #17325: Unexpected streaming replication protocol bytes for IDENTIFY_SYSTEM command
Previous Message Tom Lane 2021-12-07 14:57:53 Re: When Update balloons memory