Re: Parallel index creation does not properly cleanup after error

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel index creation does not properly cleanup after error
Date: 2018-03-11 19:41:40
Message-ID: CAH2-WzmGuTdu58BgrEcbH8u1Y=r+u9iqFY5GHaQ=t3iZcOZLDQ@mail.gmail.com
Lists: pgsql-hackers

On Sun, Mar 11, 2018 at 3:22 AM, David Rowley
<david(dot)rowley(at)2ndquadrant(dot)com> wrote:
> Due to the failure during the index build, it appears that the
> PG_TRY/PG_CATCH block in reindex_relation() causes the reindex_index()
> to abort and jump out to the catch block. Here there's a call to
> ResetReindexPending(), which complains as we're still left in parallel
> mode from the aborted _bt_begin_parallel() call which has called
> EnterParallelMode(), but not managed to make it all the way to
> _bt_end_parallel() (called from btbuild()), where ExitParallelMode()
> is normally called.
>
> Subsequent attempts to refresh the materialized view result in an
> Assert failure in list_member_oid()

Thanks for the report.

> I've not debugged that, but I assume it's because
> pendingReindexedIndexes is left as a non-empty list but has had its
> memory context obliterated due to the previous query having ended.

It's not so much a memory lifetime problem as a corruption of the
backend-local state that tracks reindexed indexes. That corruption is
a direct consequence of the "cannot modify reindex state during a
parallel operation" error you saw: the error is raised before the
list gets cleared.

> The comment in the following fragment is not well honored by the
> ResetReindexPending() since it does not clear the list if there's an
> error.

> A perhaps simple fix would be just to have ResetReindexPending() only
> reset the list to NIL again and not try to raise any error.

I noticed a very similar bug in ResetReindexProcessing() just before
parallel CREATE INDEX was committed. The fix there was simply to stop
throwing a "can't happen" error. I agree that the same fix should be
used here. It's not worth enforcing !IsInParallelMode() in the reset
functions; enforcing !IsInParallelMode() in the set functions alone is
sufficient. The attached patch does this.
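
To illustrate the idea (this is a standalone toy model, not the attached
patch and not the real PostgreSQL code), the sketch below uses
hypothetical stand-ins: in_parallel_mode models IsInParallelMode(),
the pending array models pendingReindexedIndexes, and a false return
value models where the real code would raise an error. The set function
refuses to run in parallel mode, while the reset function clears the
state unconditionally so that it is safe to call from an error path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for backend-local state; NOT the real PostgreSQL definitions. */
#define MAX_PENDING 8

static bool     in_parallel_mode = false;   /* models IsInParallelMode() */
static unsigned pending[MAX_PENDING];       /* models pendingReindexedIndexes */
static size_t   n_pending = 0;

/*
 * Set function: refuses to modify the state during a parallel operation.
 * Returns false where the real code would raise an error.
 */
static bool
set_reindex_pending(const unsigned *oids, size_t n)
{
    assert(n <= MAX_PENDING);
    if (in_parallel_mode)
        return false;
    for (size_t i = 0; i < n; i++)
        pending[i] = oids[i];
    n_pending = n;
    return true;
}

/*
 * Reset function: may be reached from an error path while the backend is
 * still in parallel mode, so it clears the state unconditionally.
 */
static void
reset_reindex_pending(void)
{
    n_pending = 0;
}

/*
 * Simulates a parallel build that fails after entering parallel mode,
 * followed by error cleanup. Returns the number of entries left pending,
 * or (size_t) -1 if the initial set step was refused.
 */
static size_t
simulate_failed_parallel_build(void)
{
    unsigned oids[] = {16384, 16385};   /* arbitrary example OIDs */
    size_t   left;

    if (!set_reindex_pending(oids, 2))
        return (size_t) -1;
    in_parallel_mode = true;    /* build entered parallel mode, then failed */
    reset_reindex_pending();    /* cleanup in the catch block */
    left = n_pending;
    in_parallel_mode = false;   /* abort processing later leaves parallel mode */
    return left;
}
```

In this model, simulate_failed_parallel_build() leaves nothing pending,
so a later command in the same backend starts from clean state. Had
reset_reindex_pending() bailed out whenever in_parallel_mode was set,
the two entries would have been left behind.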

--
Peter Geoghegan

Attachment: 0001-Fix-corruption-of-backend-REINDEX-processing-state.patch (text/x-patch, 1.5 KB)
