Re: {CREATE INDEX, REINDEX} CONCURRENTLY improvements

From: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
To: Álvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Hamid Akhtar <hamid(dot)akhtar(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: {CREATE INDEX, REINDEX} CONCURRENTLY improvements
Date: 2021-01-18 20:14:03
Message-ID: CAEze2Wh4zbYpixex7dbRJKPGKincVeopRtrA2jZzfRFjerooQQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, 15 Jan 2021 at 15:29, Álvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
>
> So one last remaining improvement was to have VACUUM ignore processes
> doing CIC and RC to compute the Xid horizon of tuples to remove. I
> think we can do something simple like the attached patch.

Regarding the patch:

> + if (statusFlags & PROC_IN_SAFE_IC)
> + h->catalog_oldest_nonremovable =
> + TransactionIdOlder(h->catalog_oldest_nonremovable, xmin);
> + else
> + h->data_oldest_nonremovable = h->catalog_oldest_nonremovable =
> + TransactionIdOlder(h->data_oldest_nonremovable, xmin);

Would this not need to be the following? Right now, it resets
potentially older h->catalog_oldest_nonremovable (which is set in the
PROC_IN_SAFE_IC branch).

> + if (statusFlags & PROC_IN_SAFE_IC)
> + h->catalog_oldest_nonremovable =
> + TransactionIdOlder(h->catalog_oldest_nonremovable, xmin);
> + else
> + {
> + h->data_oldest_nonremovable =
> + TransactionIdOlder(h->data_oldest_nonremovable, xmin);
> + h->catalog_oldest_nonremovable =
> + TransactionIdOlder(h->catalog_oldest_nonremovable, xmin);
> + }

Prior to reading the rest of my response: I do not understand the
intricacies of the VACUUM process, nor the systems of db snapshots, so
it's reasonably possible I misunderstand this.
But would this patch not allow for tuples to be created, deleted and
vacuumed away from a table whilst rebuilding an index on that table,
resulting in invalid pointers in the index?

Example:

1.) RI starts
2.) PHASE 2: filling the index:
2.1.) scanning the heap (live tuple is cached)
< tuple is deleted
< last transaction other than RI commits, only snapshot of RI exists
< vacuum drops the tuple, and cannot remove it from the new index
because this new index is not yet populated.
2.2.) sorting tuples
2.3.) index filled with tuples, incl. deleted tuple
3.) PHASE 3: wait for transactions
4.) PHASE 4: validate does not remove the tuple from the index,
because it is not built to do so: it will only insert new tuples.
Tuples that are marked for deletion are removed from the index only
through VACUUM (and optimistic ALL_DEAD detection).

According to my limited knowledge of RI, it requires VACUUM to not run
on the table during the initial index build process (which is
currently guaranteed through the use of a snapshot).

Regards,

Matthias van de Meent

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2021-01-18 20:15:54 Re: search_plan_tree(): handling of non-leaf CustomScanState nodes causes segfault
Previous Message Tom Lane 2021-01-18 20:13:42 Re: [PATCH] Add support for leading/trailing bytea trim()ing