Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements

From: Sergey Sargsyan <sergey(dot)sargsyan(dot)2001(at)gmail(dot)com>
To: Mihail Nikalayeu <mihailnikalayeu(at)gmail(dot)com>
Cc: Álvaro Herrera <alvherre(at)kurilemu(dot)de>, Andres Freund <andres(at)anarazel(dot)de>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andrey Borodin <amborodin86(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Subject: Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements
Date: 2025-06-16 20:21:47
Message-ID: CAMAof695VA+mbVRhWCTus=E0WnsMAQyqXxfOTohbcb7VUHSP4g@mail.gmail.com
Lists: pgsql-hackers

Thank you for the information. Tomorrow I will also run a few tests to
measure the time required to collect TIDs from the index; however, since I
do not work with vanilla PostgreSQL, my results may differ.

If the results indicate that this step is time-consuming, I may develop an
additional patch specifically for B-tree indexes, as they are the default
and most commonly used type.

Best regards,
Sergey

On Mon, Jun 16, 2025, 11:01 PM Mihail Nikalayeu <mihailnikalayeu(at)gmail(dot)com>
wrote:

> Hello, Sergey!
>
> > I think it's to avoid duplicate errors when adding tuples from STIP to
> > the main index, but couldn't we just suppress that error during
> > validation and skip the new tuple insertion if it already exists?
>
> In some cases, it is not possible:
> – Some index types (GiST, GIN, BRIN) do not provide an easy way to
> detect such duplicates.
> – When we are building a unique index, we cannot simply skip
> duplicates, because doing so would also skip the rows that should
> prevent the unique index from being created (unless we add extra logic
> for B-tree indexes to compare TIDs as well).
>
> > The main index may get huge after building, and iterating over it in a
> > single thread and then sorting tids can be time consuming.
> My tests indicate that the overhead is minor compared with the time
> spent scanning the heap and building the index itself.
>
> > At least I guess one can skip it when STIP is empty.
> Yes, that’s a good idea; I’ll add it later.
>
> > p.s. I noticed that `stip.c` has a lot of functions that don't follow
> > the Postgres coding style of return type on separate line.
> Hmm... I’ll fix that as well.
>
> Best regards,
> Mikhail.
>
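
The unique-index point in the quoted reply can be illustrated with a toy model. This is only a sketch under stated assumptions, not PostgreSQL code: an "index" is a plain list of (key, tid) pairs, the names `merge_skip_by_key`, `merge_compare_tid`, and `UniqueViolation` are hypothetical, and tuple visibility (which a real uniqueness check must consider) is ignored. It shows why skipping on the key alone hides a genuine duplicate row, while also comparing TIDs does not.

```python
# Toy model only: an "index" is a list of (key, tid) pairs, where tid is
# (block, offset). Real uniqueness checks also consider tuple visibility,
# which this sketch ignores. All names here are hypothetical.

class UniqueViolation(Exception):
    pass


def merge_skip_by_key(main, aux):
    """Skip any aux entry whose key already exists in main (too lenient)."""
    keys = {key for key, _ in main}
    merged = list(main)
    for key, tid in aux:
        if key in keys:
            continue  # silently hides a second row with the same key
        merged.append((key, tid))
        keys.add(key)
    return merged


def merge_compare_tid(main, aux):
    """Skip only exact (key, tid) repeats; same key, other tid is an error."""
    by_key = {key: tid for key, tid in main}
    merged = list(main)
    for key, tid in aux:
        if key in by_key:
            if by_key[key] == tid:
                continue  # same heap tuple seen twice: harmless
            raise UniqueViolation(f"key {key!r} at {by_key[key]} and {tid}")
        merged.append((key, tid))
        by_key[key] = tid
    return merged


main = [(10, (0, 1)), (20, (0, 2))]
same_tuple = [(10, (0, 1))]   # the same heap tuple, indexed twice
other_row = [(10, (0, 7))]    # a different heap tuple with the same key
```

With these inputs, `merge_skip_by_key(main, other_row)` returns the main index unchanged, so the conflicting row goes unnoticed, whereas `merge_compare_tid(main, other_row)` raises `UniqueViolation` and `merge_compare_tid(main, same_tuple)` quietly skips the repeat.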
