Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY

From: Andres Freund <andres(at)anarazel(dot)de>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Peter Geoghegan <pg(at)bowt(dot)ie>, Michael Paquier <michael(at)paquier(dot)xyz>, Петър Славов <pet(dot)slavov(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
Date: 2022-05-25 17:08:21
Message-ID: 20220525170821.rf6r4dnbbu4baujp@alap3.anarazel.de
Lists: pgsql-bugs

Hi,

On 2022-05-25 18:43:22 +0200, Alvaro Herrera wrote:
> On 2022-May-25, Robert Haas wrote:
> > Also, it seems like it would require complex new infrastructure that I
> > think we should be reluctant to invent in back branches.
>
> This is definitely true. And I think this would be expensive, because
> we'd have to check in every heap_page_prune call.

I think the cost could be addressed, along the lines of the mechanism I put in
as part of the snapshot scalability work. I.e. don't compute an accurate
horizon when not needed for pruning, only do so when within a certain range of
xids.
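
To make the "only compute an accurate horizon when within a certain range of
xids" idea concrete, here is a minimal sketch in C. It is not PostgreSQL code:
the struct, the function names and the callback are all hypothetical, and xids
are modeled as plain 64-bit counters that never wrap. It assumes two cheap
bounds are kept, and the expensive accurate computation runs only for xids
that fall between them.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified horizon with a cheap lower and upper bound. */
typedef struct HorizonSketch
{
    uint64_t maybe_needed;      /* xids below this are certainly removable */
    uint64_t definitely_needed; /* xids at or above this are certainly needed */
    int      recomputations;    /* how often the expensive path was taken */
} HorizonSketch;

/* Hypothetical callback standing in for the expensive, accurate computation. */
typedef uint64_t (*AccurateHorizonFn) (void *arg);

/* Example callback: reads a fixed "accurate" horizon from *arg. */
static uint64_t
fixed_horizon(void *arg)
{
    return *(uint64_t *) arg;
}

/*
 * Decide whether a row version deleted by xid can be pruned.  The accurate
 * horizon is computed only when xid lies between the two cheap bounds.
 */
static bool
xid_is_removable(HorizonSketch *h, uint64_t xid,
                 AccurateHorizonFn compute, void *arg)
{
    uint64_t accurate;

    if (xid < h->maybe_needed)
        return true;            /* cheap answer: old enough */
    if (xid >= h->definitely_needed)
        return false;           /* cheap answer: too recent */

    /* Uncertain range: pay for an accurate horizon and tighten the bound. */
    accurate = compute(arg);
    h->recomputations++;
    if (accurate > h->maybe_needed)
        h->maybe_needed = accurate;
    if (h->definitely_needed < h->maybe_needed)
        h->definitely_needed = h->maybe_needed;

    return xid < h->maybe_needed;
}
```

With bounds of 100 and 200 and an accurate horizon of 150, xids 50 and 250 are
answered from the bounds alone; only an xid such as 120 triggers the expensive
computation.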

But it still seems way too invasive for the back branches. Quite obviously we
would also need a lot more testing for this.

I'm also doubtful it's the right approach. The problem here comes from needing
a snapshot for the entire duration of the validation scan. ISTM that we should
work on not needing that snapshot, rather than trying to reduce the
consequences of holding that snapshot. I think it might be possible to avoid
it. Random sketch:

We could prevent HOT updates during CIC for rows inserted during the first
scan. If we did that, we could, IIRC, rely on the xids of the last row version
to determine whether an index insertion is needed during the validation scan.

> > It seems to me that we should just revert.
>
> Deciding to revert makes me sad, because this feature is extremely
> valuable for users. However, I understand the danger and I don't
> disagree with the rationale so I can't really object.

I sadly don't see how we could develop a reliable reimplementation of this
feature without delaying (or destabilizing) the release...

Greetings,

Andres Freund
