Re: REINDEX INDEX results in a crash for an index of pg_class since 9.6

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: REINDEX INDEX results in a crash for an index of pg_class since 9.6
Date: 2019-04-26 02:02:48
Message-ID: 20190426020248.iyev66bkpf2abxqd@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2019-04-25 17:12:33 -0400, Tom Lane wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
> > On 2019-04-25 16:02:03 -0400, Tom Lane wrote:
> >> That could work. The important API spec is then that the relcache entry
> >> reflects the *previous* state of the relation, and is not to be modified
> >> by the tableam call.
>
> > Right.
>
> > I was wondering if we should just pass in the pg_class tuple as an "out"
> > argument, instead of pointers to relfilnode/relfrozenxid/relminmxid.
>
> Yeah, possibly. The whole business with xids is perhaps heapam specific,
> so decoupling this function's signature from them would be good.

I've left that out in the attached. Currently VACUUM FULL / CLUSTER also
needs to handle those, and the callback for transactional rewrite
(table_relation_copy_for_cluster()), also returns those as output
parameter. I think I can see a way how we could clean up the relevant
cluster.c code, but until that's done, I don't see much point in a
different interface (I'll probably write apatch .

The attached patch fixes the problem for me, and passes all existing
tests. It contains a few changes that are not strictly necessary, but
imo clear improvements.

We probably could split the tableam changes and related refactoring from
the fix to make backpatching simpler. I've not done that yet, but I
think we should before committing.

Questions:
- Should we move the the CommandCounterIncrement() from
RelationSetNewRelfilenode() to the callers? That'd allow them to do
other things to the new relation (e.g. fill it), before making the
changes visible. Don't think it'd currently help, but it seems like it
could make code more robust in the future.

- Should we introduce an assertion into CatalogIndexInsert()'s
HeapTupleIsHeapOnly() path, that asserts that all the relevant indexes
aren't ReindexIsProcessingIndex()? Otherwise it seems way too easy to
re-introduce bugs like this one. Dirty hack for that included.

- Wonder if we shouldn't introduce something akin to
SetReindexProcessing() for table rewrites (e.g. VACUUM FULL), to
prevent the related error of inserting/updating a catalog table that's
currently being rewritten.

Taking this as a WIP, what do you think?

Greetings,

Andres Freund

Attachment Content-Type Size
wip.diff text/x-diff 17.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2019-04-26 02:23:09 Re: Reducing the runtime of the core regression tests
Previous Message Matsumura, Ryo 2019-04-26 01:51:53 RE: Patch: doc for pg_logical_emit_message()