Re: REINDEX CONCURRENTLY unexpectedly fails

From: Andres Freund <andres(at)anarazel(dot)de>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Manuel Rigger <rigger(dot)manuel(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: REINDEX CONCURRENTLY unexpectedly fails
Date: 2019-11-20 01:36:49
Message-ID: 20191120013649.2xfr2dnznyjgluch@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Hi,

I'm on vacation till early December, so I'll not respond quickly....

On 2019-11-15 17:07:06 +0900, Michael Paquier wrote:
> So, here is a patch to address all that. I have also added a test for
> REINDEX SCHEMA on a temporary schema. While looking at the problem, I
> have found a crash if trying to reindex concurrently an index or a
> table using a temporary relation from a different session because we
> have been lacking checks with RELATION_IS_OTHER_TEMP() in the
> concurrent code paths.

Probably worth fixing separately?

> Please note that I have not changed index_drop() for the concurrent
> mode. Per my checks this does not actually cause any direct bugs as
> this code path just takes care of dropping the dependencies with the
> index. There could be an argument for changing that on HEAD though,
> but I am not sure that this is worth the complication either.

I'm extremely doubtful this works correctly. E.g., what prevents the
tids for index_concurrently_set_dead() from being recycled and pointing
to a different row after one of the internal transactions?

> diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
> index 374e2d0efe..7de36ca7e2 100644
> --- a/src/backend/commands/indexcmds.c
> +++ b/src/backend/commands/indexcmds.c
> @@ -550,6 +550,15 @@ DefineIndex(Oid relationId,
> lockmode = stmt->concurrent ? ShareUpdateExclusiveLock : ShareLock;
> rel = table_open(relationId, lockmode);
>
> + /*
> + * Ignore concurrent index creation for temporary tables. Such
> + * relations only work with the current session, so they are not
> + * subject to concurrency problems. Using a non-concurrent build
> + * is also more performant.
> + */
> + if (rel->rd_rel->relpersistence == RELPERSISTENCE_TEMP)
> + stmt->concurrent = false;

I don't think "ignore concurrent index creation" is a good description -
they're still created. I'd rephrase it as "For temporary tables build
index non-concurrently", or something in that vein.

I think we also need to mention somewhere that it's actually crucial to
ignore them, as otherwise we'd run into problems with on commit truncate
/ drop.

Probably worthwhile to centralize check and comments into one place, to
avoid duplication / them diverging. Perhaps something like
IndexCreationSupportsConcurrent() or IndexCreationForceNonConcurrent()?

> @@ -2771,6 +2797,9 @@ ReindexRelationConcurrently(Oid relationOid, int options)
> /* Open relation to get its indexes */
> heapRelation = table_open(relationOid, ShareUpdateExclusiveLock);
>
> + /* Temporary tables are not processed concurrently */
> + Assert(heapRelation->rd_rel->relpersistence != RELPERSISTENCE_TEMP);

s/are not/can not/?

> +-- Temporary tables with concurrent builds
> +CREATE TEMP TABLE concur_temp (f1 int, f2 text);
> +INSERT INTO concur_temp VALUES (1, 'foo'), (2, 'bar');
> +CREATE INDEX CONCURRENTLY concur_temp_ind ON concur_temp(f1);
> +DROP TABLE concur_temp;
> +-- On-commit actions
> +CREATE TEMP TABLE concur_temp (f1 int, f2 text)
> + ON COMMIT DELETE ROWS;
> +INSERT INTO concur_temp VALUES (1, 'foo'), (2, 'bar');
> +CREATE INDEX CONCURRENTLY concur_temp_ind ON concur_temp(f1);
> +DROP TABLE concur_temp;
> --

I'd also add a test for ON COMMIT DROP.

> -- Try some concurrent index drops
> --

DROP INDEX CONCURRENTLY likely has nearly exactly the same problem,
right?

> diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
> index 629a31ef79..e26f450846 100644
> --- a/doc/src/sgml/ref/create_index.sgml
> +++ b/doc/src/sgml/ref/create_index.sgml
> @@ -129,6 +129,9 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] <replaceable class=
> &mdash; see <xref linkend="sql-createindex-concurrently"
> endterm="sql-createindex-concurrently-title"/>.
> </para>
> + <para>
> + This option is ignored for temporary relations.
> + </para>
> </listitem>
> </varlistentry>
>
> diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
> index 10881ab03a..e5d2b1a06e 100644
> --- a/doc/src/sgml/ref/reindex.sgml
> +++ b/doc/src/sgml/ref/reindex.sgml
> @@ -162,6 +162,9 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURR
> &mdash; see <xref linkend="sql-reindex-concurrently"
> endterm="sql-reindex-concurrently-title"/>.
> </para>
> + <para>
> + This option is ignored for temporary relations.
> + </para>
> </listitem>
> </varlistentry>
>

I think either this needs to be more verbose, explaining that there's no
harm from the option being ignored, or it should be ignored as an
implementation detail.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Kyotaro Horiguchi 2019-11-20 02:11:16 Re: initdb SegFault
Previous Message Thomas Munro 2019-11-20 00:05:56 Re: BUG #16104: Invalid DSA Memory Alloc Request in Parallel Hash