Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements

From: Sergey Sargsyan <sergey(dot)sargsyan(dot)2001(at)gmail(dot)com>
To: Mihail Nikalayeu <mihailnikalayeu(at)gmail(dot)com>
Cc: Álvaro Herrera <alvherre(at)kurilemu(dot)de>, Andres Freund <andres(at)anarazel(dot)de>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andrey Borodin <amborodin86(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Subject: Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements
Date: 2025-06-16 16:17:33
Message-ID: CAMAof6-4xaV3QE2ErYJaJhu6qjFn99sWyo_HQeBhHikZM3GexA@mail.gmail.com
Lists: pgsql-hackers

Hey Mihail,

I've started looking at the patches today, mostly the STIR part. It seems
solid, but I have a question about validation: why are we still collecting
TIDs from the main index and sorting them?

I think it's to avoid duplicate errors when adding tuples from STIR to the
main index, but couldn't we just suppress that error during validation and
skip inserting the new tuple if it already exists?

The main index may be huge once it is built, and scanning it in a single
thread and then sorting the TIDs can be time-consuming.

At the very least, that step could be skipped when STIR is empty. But I think
we could skip it altogether by deciding how to handle duplicates, which would
make concurrent and non-concurrent index creation almost identical in speed
(only locking and atomicity would differ).
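
To make the question concrete, here is a toy sketch of the two strategies as
I understand them. It is not taken from the patch: the function names are
made up, TIDs are modeled as (block, offset) tuples, and a plain Python
list/set stands in for the main index, so treat it as an illustration of the
idea only.

```python
# Toy model only. None of these names exist in the patch or in PostgreSQL.
from bisect import bisect_left


def validate_with_sorted_scan(main_tids, aux_tids):
    """Current approach, as I read it: scan the whole main index, sort its
    TIDs, and return only the auxiliary TIDs that are still missing."""
    if not aux_tids:
        # Nothing was captured during the build, so there is nothing to merge.
        return []
    sorted_main = sorted(main_tids)  # cost grows with the size of the main index
    missing = []
    for tid in sorted(set(aux_tids)):
        pos = bisect_left(sorted_main, tid)
        if pos == len(sorted_main) or sorted_main[pos] != tid:
            missing.append(tid)
    return missing


def validate_with_skip_on_duplicate(main_index, aux_tids):
    """Suggested alternative: try each auxiliary TID directly and skip it
    if it is already present. The set lookup stands in for a per-tuple
    probe of the real index (an assumption of this sketch)."""
    inserted = []
    for tid in sorted(set(aux_tids)):
        if tid in main_index:
            continue  # duplicate: skip instead of raising an error
        main_index.add(tid)
        inserted.append(tid)
    return inserted


main = [(1, 1), (1, 2), (2, 1)]
aux = [(1, 2), (2, 5), (3, 1)]
assert validate_with_sorted_scan(main, aux) == [(2, 5), (3, 1)]
assert validate_with_skip_on_duplicate(set(main), aux) == [(2, 5), (3, 1)]
```

Both variants end up adding the same tuples. The difference is that the
second one does work proportional to the number of tuples in STIR rather than
to the size of the main index, assuming a per-tuple existence check is cheap
enough in the real index.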

P.S. I noticed that `stir.c` has a lot of functions that don't follow the
Postgres coding style of putting the return type on its own line.

On Mon, Jun 16, 2025, 6:41 PM Mihail Nikalayeu <mihailnikalayeu(at)gmail(dot)com>
wrote:

> Hello, everyone!
>
> Rebased, patch structure and comments available here [0]. Quick
> introduction poster - here [1].
>
> Best regards,
> Mikhail.
>
> [0]:
> https://www.postgresql.org/message-id/flat/CADzfLwVOcZ9mg8gOG%2BKXWurt%3DMHRcqNv3XSECYoXyM3ENrxyfQ%40mail.gmail.com#52c97e004b8f628473124c05e3bf2da1
> [1]:
> https://www.postgresql.org/message-id/attachment/176651/STIR-poster.pdf
>
