Re: REINDEX CONCURRENTLY unexpectedly fails

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Manuel Rigger <rigger(dot)manuel(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: REINDEX CONCURRENTLY unexpectedly fails
Date: 2020-01-09 03:06:19
Message-ID: 20200109030619.GC2251@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Wed, Jan 08, 2020 at 05:19:30PM +0900, Michael Paquier wrote:
> I have spent a couple of hours poking at this code, and found two
> problems:
> 1) The error reporting for PROGRESS_CREATEIDX_COMMAND would report a
> concurrent build, but that's not the case if the work happens for a
> temporary table in DefineIndex(), so the call to
> RelationSupportsConcurrentIndexing needs to happen before any look at
> the concurrent flag is done. That's easy enough to fix.
> 2) The handling of the patch within index_drop is too weak. As
> presented, the patch first locks the OID using a RangeVar. However
> for a temporary relation we would first take ShareUpdateExclusiveLock
> RemoveRelations() and then upgrade to a AccessExclusiveLock in
> index_drop(). I think that actually the check in index_drop() is not
> necessary, and that instead we had better do three things:
> a) In RangeVarCallbackForDropRelation(), if the relation is temporary,
> use AccessExclusiveLock all the time, and we know the OID of the
> relation here.
> b) After locking the OID with the RangeVar, re-check if the relation
> is temporary, and then remove PERFORM_DELETION_CONCURRENTLY is.
> c) Add an assertion in index_drop() to be sure that this code path is
> never invoked concurrently with a temporary relation.
>
> I am lacking of time today, I'll continue tomorrow.

Okay, so here is an updated patch fixing those issues, with several
modifications done to the patch (docs, updates for the assertions,
some redesign). Considering again those aspects, I have come up with
the same conclusion as what's stated above, though you actually need
to make sure that it is RangeVarGetRelidExtended() that has to be
careful about the lock to use on the temporary relation, before
anything else is done. The callback RangeVarCallbackForDropRelation()
also needs to be careful about the relation it looks at and check if
the relation supports concurrent indexing. On the other hand, we
could also say that we don't care about lock upgrade risks when
working on temporary tables because these are not accessed by other
sessions, though that's not a sane base to rely on IMO. A solution
involving RangeVarGetRelidExtended() feels also like a sledgehammer to
smash a peanut, because it has a wider impact. If lock upgrade risks
are not worth bothering, this needs to be clearly documented in the
patch with more comments.

As the patch has been heavily modified, I am switching it back to
"Needs Review" for now and I'd like to discuss more about the lock
upgrade risks, particularly if it is considered worth the effort for
temporary relations. Thoughts are welcome.
--
Michael

Attachment Content-Type Size
reindex-conc-temp-v4.patch text/x-diff 17.2 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Michael Paquier 2020-01-09 04:06:12 Re: Assert failure due to "drop schema pg_temp_3 cascade" for temporary tables and \d+ is not showing any info after drooping temp table schema
Previous Message Prathmesh Agarwadekar 2020-01-09 00:26:33 Error while trying to open pgadmin