Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements

From: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
To: Michail Nikolaev <michail(dot)nikolaev(at)gmail(dot)com>
Cc: Melanie Plageman <melanieplageman(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Subject: Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements
Date: 2024-03-05 20:08:08
Message-ID: CAEze2Wh3eSAnXFdY_6roNPb3WD-YsKbNLiKf=cPmAGHkPUd22w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 21 Feb 2024 at 12:37, Michail Nikolaev
<michail(dot)nikolaev(at)gmail(dot)com> wrote:
>
> Hi!
>
> > How do you suppose this would work differently from a long-lived
> > normal snapshot, which is how it works right now?
>
> Difference in the ability to take new visibility snapshot periodically
> during the second phase with rechecking visibility of tuple according
> to the "reference" snapshot (which is taken only once like now).
> It is the approach from (1) but with a workaround for the issues
> caused by heap_page_prune_opt.
>
> > Would it be exclusively for that relation?
> Yes, only for that affected relation. Other relations are unaffected.

I suppose this could work. We'd also need to be very sure that the
toast relation isn't cleaned up either: Even though that's currently
DELETE+INSERT only and can't apply HOT, it would be an issue if we
couldn't find the TOAST data of a deleted for everyone (but visible to
us) tuple.

Note that disabling cleanup for a relation will also disable cleanup
of tuple versions in that table that are not used for the R/CIC
snapshots, and that'd be an issue, too.

> > How would this be integrated with e.g. heap_page_prune_opt?
> Probably by some flag in RelationData, but not sure here yet.
>
> If the idea looks sane, I could try to extend my POC - it should be
> not too hard, likely (I already have tests to make sure it is
> correct).

I'm not a fan of this approach. Changing visibility and cleanup
semantics to only benefit R/CIC sounds like a pain to work with in
essentially all visibility-related code. I'd much rather have to deal
with another index AM, even if it takes more time: the changes in
semantics will be limited to a new plug in the index AM system and a
behaviour change in R/CIC, rather than behaviour that changes in all
visibility-checking code.

But regardless of second scan snapshots, I think we can worry about
that part at a later moment: The first scan phase is usually the most
expensive and takes the most time of all phases that hold snapshots,
and in the above discussion we agreed that we can already reduce the
time that a snapshot is held during that phase significantly. Sure, it
isn't great that we have to scan the table again with only a single
snapshot, but generally phase 2 doesn't have that much to do (except
when BRIN indexes are involved) so this is likely less of an issue.
And even if it is, we would still have reduced the number of
long-lived snapshots by half.

-Matthias

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alena Rybakina 2024-03-05 20:10:00 Re: Support "Right Semi Join" plan shapes
Previous Message Nathan Bossart 2024-03-05 19:52:12 Re: Switching XLog source from archive to streaming when primary available