Re: {CREATE INDEX, REINDEX} CONCURRENTLY improvements

From: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
To: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Cc: Álvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Hamid Akhtar <hamid(dot)akhtar(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: {CREATE INDEX, REINDEX} CONCURRENTLY improvements
Date: 2021-01-21 20:08:39
Message-ID: CAEze2WjfZPWBW_LeX-gVpRwx=z1T3OZssS4xkNPdBieSDVM1ew@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 19 Jan 2021 at 21:59, Matthias van de Meent
<boekewurm+postgres(at)gmail(dot)com> wrote:
>
> On Mon, 18 Jan 2021, 21:25 Álvaro Herrera, <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
> >
> > On 2021-Jan-18, Matthias van de Meent wrote:
> >
> > > Example:
> > >
> > > 1.) RI starts
> > > 2.) PHASE 2: filling the index:
> > > 2.1.) scanning the heap (live tuple is cached)
> > > < tuple is deleted
> > > < last transaction other than RI commits, only snapshot of RI exists
> > > < vacuum drops the tuple, and cannot remove it from the new index
> > > because this new index is not yet populated.
> > > 2.2.) sorting tuples
> > > 2.3.) index filled with tuples, incl. deleted tuple
> > > 3.) PHASE 3: wait for transactions
> > > 4.) PHASE 4: validate does not remove the tuple from the index,
> > > because it is not built to do so: it will only insert new tuples.
> > > Tuples that are marked for deletion are removed from the index only
> > > through VACUUM (and optimistic ALL_DEAD detection).
> > >
> > > According to my limited knowledge of RI, it requires VACUUM to not run
> > > on the table during the initial index build process (which is
> > > currently guaranteed through the use of a snapshot).
> >
> > VACUUM cannot run concurrently with CIC or RI in a table -- both acquire
> > ShareUpdateExclusiveLock, which conflicts with itself, so this cannot
> > occur.
>
> Yes, you are correct. Vacuum indeed has a ShareUpdateExclusiveLock.
> Are there no other ways that pages are optimistically pruned?
>
> But the base case still stands, ignoring CIC snapshots in would give
> the semantic of all_dead to tuples that are actually still considered
> alive in some context, and should not yet be deleted (you're deleting
> data from an in-use snapshot). Any local pruning optimizations using
> all_dead mechanics now cannot be run on the table unless they hold an
> ShareUpdateExclusiveLock; though I'm unaware of any such mechanisms
> (other than below).

Re-thinking this, and after some research:

Is the behaviour of "any process that invalidates TIDs in this table
(that could be in an index on this table) always holds a lock that
conflicts with CIC/RiC on that table" a requirement of tableams, or is
it an implementation-detail?

If it is a requirement, then this patch is a +1 for me (and that
requirement should be documented in such case), otherwise a -1 while
there is no mechanism in place to remove concurrently-invalidated TIDs
from CIC-ed/RiC-ed indexes.

This concurrently-invalidated check could be done through e.g.
updating validate_index to have one more phase that removes unknown /
incorrect TIDs from the index. As a note: index insertion logic would
then also have to be able to handle duplicate TIDs in the index.

Regards,

Matthias van de Meent

Regards,

Matthias van de Meent

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2021-01-21 20:13:23 Re: [HACKERS] [PATCH] Generic type subscripting
Previous Message Tom Lane 2021-01-21 18:38:44 Re: [PATCH 1/1] Fix detection of pwritev support for OSX.