Re: REINDEX INDEX results in a crash for an index of pg_class since 9.6

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: REINDEX INDEX results in a crash for an index of pg_class since 9.6
Date: 2019-05-01 17:06:03
Message-ID: 20190501170603.hx6r3ztyvi3rlgq6@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2019-05-01 12:20:22 -0400, Tom Lane wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
> > I see the bug. Turns out we need to figure out another way to solve the
> > assertion triggered by doing catalog updates within
> > RelationSetNewRelfilenode() - we can't just move the
> > SetReindexProcessing() before it. When CCA is enabled, the
> > CommandCounterIncrement() near the tail of RelationSetNewRelfilenode()
> > triggers a rebuild of the catalog entries - but without the
> > SetReindexProcessing() those scans will try to use the index currently
> > being rebuilt.
>
> Yeah. I think what this demonstrates is that REINDEX INDEX has to have
> RelationSetIndexList logic similar to what REINDEX TABLE has, to control
> which indexes get updated when while we're rebuilding an index of
> pg_class. In hindsight that seems glaringly obvious ... I wonder how we
> missed that when we built that infrastructure for REINDEX TABLE?

I'm not sure this is the right short-term answer. Why isn't it, for now,
sufficient to do what I suggested with RelationSetNewRelfilenode() not
doing the CommandCounterIncrement(), and reindex_index() then doing the
SetReindexProcessing() before a CommandCounterIncrement()? That's like
~10 line code change, and a few more with comments.

There is the danger that the current and above approach basically relies
on there not to be any non-inplace updates during reindex. But at the
moment code does take care to use inplace updates
(cf. index_update_stats()).

It's not clear to me whether the approach of using
RelationSetIndexList() in reindex_index() would be meaningfully more
robust against non-inplace updates during reindex either - ISTM we'd
just as well skip the necessary index insertions if we hid the index
being rebuilt. Skipping to-be-rebuilt indexes works for
reindex_relation() because they're going to be rebuilt subsequently (and
thus the missing index rows don't matter) - but it'd not work for
reindexing a single index, because it'll not get the result at a later
stage.

> I'm pretty sure that infrastructure is my fault, so I'll take a
> whack at fixing this.
>
> Did you figure out why this doesn't also happen in HEAD?

It does for me now, at least when just doing a reindex in isolation (CCA
tests would have taken too long last night). I'm not sure why I wasn't
previously able to trigger it and markhor hasn't run yet on master.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2019-05-01 17:09:07 Re: hyrax vs. RelationBuildPartitionDesc
Previous Message Tom Lane 2019-05-01 16:59:10 Re: REINDEX INDEX results in a crash for an index of pg_class since 9.6