Re: REINDEX CONCURRENTLY unexpectedly fails

From: Andres Freund <andres(at)anarazel(dot)de>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Manuel Rigger <rigger(dot)manuel(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: REINDEX CONCURRENTLY unexpectedly fails
Date: 2019-12-12 21:37:09
Message-ID: 20191212213709.neopqccvdo724eha@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Hi,

On 2019-11-20 12:54:08 +0900, Michael Paquier wrote:
> On Tue, Nov 19, 2019 at 05:36:49PM -0800, Andres Freund wrote:
> >> Please note that I have not changed index_drop() for the concurrent
> >> mode. Per my checks this does not actually cause any direct bugs as
> >> this code path just takes care of dropping the dependencies with the
> >> index. There could be an argument for changing that on HEAD though,
> >> but I am not sure that this is worth the complication either.
> >
> > I'm extremely doubtful this works correctly. E.g., what prevents the
> > tids for index_concurrently_set_dead() from being recycled and pointing
> > to a different row after one of the internal transactions?
>
> ON COMMIT DELETE ROWS does a physical truncation of the relation
> files. And as DROP INDEX CONCURRENTLY cannot be run in a transaction
> block, you would never face a case where you have no past TIDs which
> could be referred to when setting the index as invalid.

It's probably not reachable, but it strikes me as really fragile and
dangerous. If e.g. somehow ON COMMIT DROP tables could exist when DROP
CONCURRENTLY were run, the index_concurrently_set_dead() could very well
target a row that's since been deleted in an earlier transaction.

> Now I don't actually object to enforce the non-concurrent path in
> index_drop() for *all* temporary relations. Anyway, that makes sense
> in itself on performance grounds, similarly to the create path, so did
> that by enforcing the flag in index_drop() (doDeletion would be
> tempting but I took the problem at its root). And added some tests
> for the drop path and an extra assertion.

Cool.

I still think we'd be well served to add a few CheckTableNotInUse() type
checks...

> >> +-- Temporary tables with concurrent builds
> >> +CREATE TEMP TABLE concur_temp (f1 int, f2 text);
> >> +INSERT INTO concur_temp VALUES (1, 'foo'), (2, 'bar');
> >> +CREATE INDEX CONCURRENTLY concur_temp_ind ON concur_temp(f1);
> >> +DROP TABLE concur_temp;
> >> +-- On-commit actions
> >> +CREATE TEMP TABLE concur_temp (f1 int, f2 text)
> >> + ON COMMIT DELETE ROWS;
> >> +INSERT INTO concur_temp VALUES (1, 'foo'), (2, 'bar');
> >> +CREATE INDEX CONCURRENTLY concur_temp_ind ON concur_temp(f1);
> >> +DROP TABLE concur_temp;
> >> --
> >
> > I'd also add a test for ON COMMIT DROP.
>
> Considered that, but ON COMMIT DROP does not make sense because it
> requires a transaction context, which is why I did not add one. And
> it seems to me that there is not much value to just check after CIC or
> REINDEX's restriction to not run in a transaction block? I added
> tests for these two, but I am of the opinion that they don't bring
> much.

I think because CIC now falls back to non-concurrent mode, it's
worthwhile to exercise this path. It seems far from unlikely that the
code gets moved around enough that suddenly CIC is allowed in
transactions when targetting temp tables.

> >> diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
> >> index 10881ab03a..e5d2b1a06e 100644
> >> --- a/doc/src/sgml/ref/reindex.sgml
> >> +++ b/doc/src/sgml/ref/reindex.sgml
> >> @@ -162,6 +162,9 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURR
> >> &mdash; see <xref linkend="sql-reindex-concurrently"
> >> endterm="sql-reindex-concurrently-title"/>.
> >> </para>
> >> + <para>
> >> + This option is ignored for temporary relations.
> >> + </para>
> >> </listitem>
> >> </varlistentry>
> >>
> >
> > I think either this needs to be more verbose, explaining that there's no
> > harm from the option being ignored, or it should be ignored as an
> > implementation detail.
>
> I think that documenting it is good for the end-user as well.

Why?

> Just attempted something in the updated version for all the sections
> of the docs involved. -- Michael

> diff --git a/src/include/catalog/index.h b/src/include/catalog/index.h
> index 1113d25b2d..04d3d4826f 100644
> --- a/src/include/catalog/index.h
> +++ b/src/include/catalog/index.h
> @@ -113,6 +113,8 @@ extern bool CompareIndexInfo(IndexInfo *info1, IndexInfo *info2,
>
> extern void BuildSpeculativeIndexInfo(Relation index, IndexInfo *ii);
>
> +extern bool RelationSupportsConcurrently(char relpersistence);
> +
> extern void FormIndexDatum(IndexInfo *indexInfo,
> TupleTableSlot *slot,
> EState *estate,
> diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
> index 67f637de11..0012ebf69c 100644
> --- a/src/backend/catalog/index.c
> +++ b/src/backend/catalog/index.c
> @@ -2017,6 +2017,13 @@ index_drop(Oid indexId, bool concurrent, bool concurrent_lock_mode)
> LOCKTAG heaplocktag;
> LOCKMODE lockmode;
>
> + /*
> + * Enforce non-concurrent drop if the relation does not support this
> + * option.
> + */
> + if (!RelationSupportsConcurrently(get_rel_persistence(indexId)))
> + concurrent = false;
> +

Echoing Alvaro, I'm less than convinced by this name.

> + /*
> + * Enforce non-concurrent build if the relation does not support this
> + * option.
> + */
> + if (!RelationSupportsConcurrently(rel->rd_rel->relpersistence))
> + stmt->concurrent = false;
> +

Copying this to some, but not all, the places where
RelationSupportsConcurrently() is called doesn't seem helpful...

> --- a/doc/src/sgml/ref/create_index.sgml
> +++ b/doc/src/sgml/ref/create_index.sgml
> @@ -129,6 +129,11 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] <replaceable class=
> &mdash; see <xref linkend="sql-createindex-concurrently"
> endterm="sql-createindex-concurrently-title"/>.
> </para>
> + <para>
> + This option is ignored for temporary relations as such relations
> + are assigned to the session using them, so they are not subject to
> + problems with concurrent locking issues.
> + </para>
> </listitem>
> </varlistentry>

This strikes me as confusing to users. They won't understand "concurrent
locking issues" (nor am I sure that I really know what precisely that
means). And "temporary relations as such relations are assigned" is also
confusing.

If we want to add docs, I'd say at most something like "For temporary
tables index creation is always non-concurrent, as no other session can
access them, and non-concurrent index creation is cheaper.".

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Andres Freund 2019-12-12 21:59:27 Re: postgres has no spinlock support on riscv rv64imafdc
Previous Message PG Bug reporting form 2019-12-12 20:15:22 BUG #16164: Sending shared secret in all low case