Re: v12.0: interrupt reindex CONCURRENTLY: ccold: ERROR: could not find tuple for parent of relation ...

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Andreas Karlsson <andreas(at)proxel(dot)se>
Subject: Re: v12.0: interrupt reindex CONCURRENTLY: ccold: ERROR: could not find tuple for parent of relation ...
Date: 2019-10-28 07:14:41
Message-ID: 20191028071441.GD1687@paquier.xyz
Lists: pgsql-hackers

On Thu, Oct 24, 2019 at 01:59:29PM +0900, Michael Paquier wrote:
> Yes, I can confirm the report. In the case of this scenario the
> reindex is waiting for the first transaction to finish before step 5,
> the cancellation causing the follow-up process to not be done
> (set_dead & the next ones). So at this stage the swap has actually
> happened. I am still analyzing the report in depth, but you don't
> have any problems with a plain index when interrupting at this stage,
> and the old index can be cleanly dropped with the new one present, so
> my first thoughts are that we are just missing some more dependency
> cleanup at the swap phase when dealing with a partition index.

Okay, I have found this one. The issue is that at the swap phase
pg_class.relispartition of the new index is updated to use the value
of the old index (true for a partition index), but the old index keeps
its own value. relispartition needs to be updated for the old index
as well, or any attempt to interact with it fails because the old
index is no longer part of any inheritance tree. We could just use
false, as the index created concurrently is not attached to a
partition, with its inheritance links updated, until the swap phase,
but it feels more natural to just swap relispartition for the old and
the new index, as per the attached.
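
As a rough illustration of the difference, here is a toy model in C.
It is not the attached patch and uses no PostgreSQL internals: the
ToyClassRow struct and both toy_swap_* functions are hypothetical
names made up for this sketch, standing in for the two pg_class rows
touched at the swap phase.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the pg_class fields involved. */
typedef struct ToyClassRow
{
    const char *relname;
    bool        relispartition;
} ToyClassRow;

/*
 * Behaviour described as problematic: the new index takes the flag of
 * the old index, and the old index is left untouched.
 */
static void
toy_swap_copy_only(ToyClassRow *new_idx, const ToyClassRow *old_idx)
{
    assert(new_idx != NULL && old_idx != NULL);
    new_idx->relispartition = old_idx->relispartition;
}

/*
 * Behaviour proposed here: exchange the flag between both rows, so the
 * old index ends up with the value the new index had before the swap.
 */
static void
toy_swap_exchange(ToyClassRow *new_idx, ToyClassRow *old_idx)
{
    bool        tmp;

    assert(new_idx != NULL && old_idx != NULL);
    tmp = new_idx->relispartition;
    new_idx->relispartition = old_idx->relispartition;
    old_idx->relispartition = tmp;
}
```

With an old partition index (flag true) and a freshly built index
(flag false), toy_swap_copy_only leaves both rows marked as
partitions, while toy_swap_exchange leaves only the new index marked.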

This also brings up the point that you could just update pg_class to
fix things if you have a broken cluster.

In short, the attached fixes the issue for me, and that's the last bug
I know of in what has been reported.
--
Michael

Attachment Content-Type Size
reindex-conc-relispartition.patch text/x-diff 1.1 KB
