Re: REINDEX INDEX results in a crash for an index of pg_class since 9.6

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: REINDEX INDEX results in a crash for an index of pg_class since 9.6
Date: 2019-05-02 15:31:07
Message-ID: 20190502153107.o5zadep4ehebzoqy@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2019-05-02 10:49:00 -0400, Tom Lane wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
> > On 2019-05-01 22:01:53 -0400, Tom Lane wrote:
> >> I think that argument is pretty pointless considering that "REINDEX TABLE
> >> pg_class" does it this way, and that code is nearly old enough to
> >> vote.
>
> > IMO the reindex_relation() case isn't comparable.
>
> IMV it's the exact same case: we need to perform a pg_class update while
> one or more of pg_class's indexes shouldn't be touched. I am kind of
> wondering why it didn't seem to be necessary to cover this for REINDEX
> INDEX back in 2003, but it clearly is necessary now.
>
> > That's not pretty either :(
>
> So, I don't like your patch, you don't like mine. Anybody else
> want to weigh in?

Well, I think I can live with your fix. I think it's likely to hide
future bugs, but this is an active bug. And, as you say, we don't have a
lot of time.

ISTM that if we go down this path, we should split (not now, but either
still in v12, or *early* in v13), the sets of indexes that are intended
to a) not being used for catalog queries b) may be skipped for index
insertions. It seems pretty likely that somebody will otherwise soon
introduce an heap_update() somewhere into the index build process, and
it'll work just fine in testing due to HOT.

We already have somewhat separate and half complimentary mechanisms
here:
1) When REINDEX_REL_SUPPRESS_INDEX_USE is set (only cluster.c), we mark
indexes on tables as unused by SetReindexPending(). That prevents them
from being used for catalog queries. But it disallows new inserts
into them.

2) When reindex_index() starts processing an index, it marks it as being
processed. Indexes on this list are not alowed to be inserted to
(enforced by assertions). Note that this currently removes the
specific index from the list set by 1).

It also marks the heap as being reindexed, which then triggers (as
the sole effect afaict), some special case logic in
index_update_stats(), that avoids the syscache and opts for a direct
manual catalog scan. I'm a bit confused as to why that's necessary.

3) Just for pg_class, reindex_relation(), just hard-sets the list of
indexes that are alreday rebuilt. This allows index insertions into
the the indexes that are later going to be rebuilt - which is
necessary because we currently update pg_class in
RelationSetNewRelfilenode().

Seems odd to resort to RelationSetIndexList(), when we could just mildly
extend the SetReindexPending() logic instead.

I kinda wonder if there's not a third approach hiding somewhere here. We
could just stop updating pg_class in RelationSetNewRelfilenode() in
pg_class, when it's an index on pg_class. The pg_class changes for
mapped indexes done aren't really crucial, and are going to be
overwritten later by index_update_stats(). That'd have the big
advantage that we'd afaict not need the logic of having to allow
catalog modifications at all during the reindex path at all.

> We do not have the luxury of time to argue about this. If we commit
> something today, we *might* get a useful set of CLOBBER_CACHE_ALWAYS
> results for all branches by Sunday.

Yea. I think I'll also just trigger a manual CCA run of check-world for
all branches (thank god for old workstations). And CCR for at least a
few crucial bits.

> Those regression tests will have to come out of the back branches on
> Sunday, because we are not shipping minor releases with unstable
> regression tests, and I've heard no proposal for avoiding the
> occasional-deadlock problem.

Yea, I've just proposed the same in a separate thread.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2019-05-02 15:32:43 Re: New vacuum option to do only freezing
Previous Message Robert Haas 2019-05-02 15:09:10 Re: New vacuum option to do only freezing