Re: REINDEX CONCURRENTLY unexpectedly fails

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Manuel Rigger <rigger(dot)manuel(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: REINDEX CONCURRENTLY unexpectedly fails
Date: 2019-11-20 03:54:08
Message-ID: 20191120035408.GA4243@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Tue, Nov 19, 2019 at 05:36:49PM -0800, Andres Freund wrote:
> On 2019-11-15 17:07:06 +0900, Michael Paquier wrote:
>> So, here is a patch to address all that. I have also added a test for
>> REINDEX SCHEMA on a temporary schema. While looking at the problem, I
>> have found a crash if trying to reindex concurrently an index or a
>> table using a temporary relation from a different session because we
>> have been lacking checks with RELATION_IS_OTHER_TEMP() in the
>> concurrent code paths.

Thanks for providing comments. The next set of minor releases is in
February, so we have some room until that. For now I'll just add this
patch to the next CF as a bug fix to keeo track of it.

> Probably worth fixing separately?

There is already a check for RELATION_IS_OTHER_TEMP() in the
non-concurrent reindex path, so by forcing temp relations to take this
path then the issue gets fixed automatically, with the error message
from reindex_index() generated.

>> Please note that I have not changed index_drop() for the concurrent
>> mode. Per my checks this does not actually cause any direct bugs as
>> this code path just takes care of dropping the dependencies with the
>> index. There could be an argument for changing that on HEAD though,
>> but I am not sure that this is worth the complication either.
>
> I'm extremely doubtful this works correctly. E.g., what prevents the
> tids for index_concurrently_set_dead() from being recycled and pointing
> to a different row after one of the internal transactions?

ON COMMIT DELETE ROWS does a physical truncation of the relation
files. And as DROP INDEX CONCURRENTLY cannot be run in a transaction
block, you would never face a case where you have no past TIDs which
could be referred to when setting the index as invalid. Now I don't
actually object to enforce the non-concurrent path in index_drop() for
*all* temporary relations. Anyway, that makes sense in itself on
performance grounds, similarly to the create path, so did that by
enforcing the flag in index_drop() (doDeletion would be tempting but I
took the problem at its root). And added some tests for the drop path
and an extra assertion.

> I don't think "ignore concurrent index creation" is a good description -
> they're still created. I'd rephrase it as "For temporary tables build
> index non-concurrently", or something in that vein.
>
> I think we also need to mention somewhere that it's actually crucial to
> ignore them, as otherwise we'd run into problems with on commit truncate
> / drop.

Sure. I have expanded the comment section on that.

> Probably worthwhile to centralize check and comments into one place, to
> avoid duplication / them diverging. Perhaps something like
> IndexCreationSupportsConcurrent() or IndexCreationForceNonConcurrent()?

Good idea, done that. It reduces the explanations of the patch to be
in a single location.

>> @@ -2771,6 +2797,9 @@ ReindexRelationConcurrently(Oid relationOid, int options)
>> /* Open relation to get its indexes */
>> heapRelation = table_open(relationOid, ShareUpdateExclusiveLock);
>>
>> + /* Temporary tables are not processed concurrently */
>> + Assert(heapRelation->rd_rel->relpersistence != RELPERSISTENCE_TEMP);
>
> s/are not/can not/?

Okay.

>> +-- Temporary tables with concurrent builds
>> +CREATE TEMP TABLE concur_temp (f1 int, f2 text);
>> +INSERT INTO concur_temp VALUES (1, 'foo'), (2, 'bar');
>> +CREATE INDEX CONCURRENTLY concur_temp_ind ON concur_temp(f1);
>> +DROP TABLE concur_temp;
>> +-- On-commit actions
>> +CREATE TEMP TABLE concur_temp (f1 int, f2 text)
>> + ON COMMIT DELETE ROWS;
>> +INSERT INTO concur_temp VALUES (1, 'foo'), (2, 'bar');
>> +CREATE INDEX CONCURRENTLY concur_temp_ind ON concur_temp(f1);
>> +DROP TABLE concur_temp;
>> --
>
> I'd also add a test for ON COMMIT DROP.

Considered that, but ON COMMIT DROP does not make sense because it
requires a transaction context, which is why I did not add one. And
it seems to me that there is not much value to just check after CIC or
REINDEX's restriction to not run in a transaction block? I added
tests for these two, but I am of the opinion that they don't bring
much.

>> diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
>> index 10881ab03a..e5d2b1a06e 100644
>> --- a/doc/src/sgml/ref/reindex.sgml
>> +++ b/doc/src/sgml/ref/reindex.sgml
>> @@ -162,6 +162,9 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURR
>> &mdash; see <xref linkend="sql-reindex-concurrently"
>> endterm="sql-reindex-concurrently-title"/>.
>> </para>
>> + <para>
>> + This option is ignored for temporary relations.
>> + </para>
>> </listitem>
>> </varlistentry>
>>
>
> I think either this needs to be more verbose, explaining that there's no
> harm from the option being ignored, or it should be ignored as an
> implementation detail.

I think that documenting it is good for the end-user as well. Just
attempted something in the updated version for all the sections of the
docs involved.
--
Michael

Attachment Content-Type Size
reindex-conc-temp-v2.patch text/x-diff 14.0 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message PG Bug reporting form 2019-11-20 08:50:29 BUG #16126: Missing the sp_OACreate procedures
Previous Message Kyotaro Horiguchi 2019-11-20 02:11:16 Re: initdb SegFault