| From: | Mihail Nikalayeu <mihailnikalayeu(at)gmail(dot)com> |
|---|---|
| To: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
| Cc: | Hannu Krosing <hannuk(at)google(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Antonin Houska <ah(at)cybertec(dot)at>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Sergey Sargsyan <sergey(dot)sargsyan(dot)2001(at)gmail(dot)com>, alvherre(at)kurilemu(dot)de, Andres Freund <andres(at)anarazel(dot)de>, Michael Paquier <michael(at)paquier(dot)xyz>, Andrey Borodin <amborodin86(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com> |
| Subject: | Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements |
| Date: | 2025-12-16 21:58:00 |
| Message-ID: | CADzfLwWUM2H2+daMv9q0+roX_NGe3cPT+AgvQ60Vh_J-e3ts6Q@mail.gmail.com |
| Lists: | pgsql-hackers |
Hello, Heikki!
On Tue, Dec 16, 2025 at 2:43 PM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
> Firstly, I think the STIR approach is the right approach at the high
> level. I don't like the logical decoding idea, for the reasons Matthias
> and Mikhail already mentioned. Maybe there's some synergy with REPACK,
> but it feels different enough that I doubt it. Let's focus on the STIR
> approach.
Thanks for checking that thread.
> In the first transaction that inserts the catalog entry with
> indisready=false, also create a shmem struct. In that struct, we can
> store information about what state the build is in, and whether
> insertions should go to the STIR or to the real index.
Yes, it might look simpler, but on the other hand:
* we need to check that shmem for each index insert (whether or not
a build is in progress)
* or we need to put something into the index list with the information
"write into that shmem instead of into that index"
* we currently have proven mechanisms for transactions, catalog
snapshots, relcache, invalidation, etc. Some tricky synchronization
may be required here (to avoid any drift between how a transaction
sees the shmem and how it sees the relcache).
> As one small incremental improvement, we could use the shmem struct to
> avoid one of the "wait for all transactions" steps in the current
> implementation. In validate_index(), after we mark the index as
> 'indisready' we have to wait for all transactions to finish, to ensure
> that all subsequent insertions have seen the indisready=true change. We
> could avoid that by setting a flag in the shmem struct instead, so that
> all backends would see instantly that the flag is flipped.
That may be tricky. If I set a flag, what if someone checked it 1 ns
ago and decided that it does not need to write anything to the index?
How do we ensure that everyone really knows about the change without
heavy locking?
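To make the concern concrete, here is a minimal single-threaded replay
of that interleaving. It is only a sketch: BuildState, Backend and the
three functions are hypothetical names invented for the illustration,
not anything from the patches or from PostgreSQL itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the proposed shmem struct. */
typedef struct BuildState
{
    bool index_ready;   /* flipped by the backend building the index */
} BuildState;

/* Hypothetical inserting backend: remembers what it saw at check time. */
typedef struct Backend
{
    bool saw_ready;
} Backend;

/* Step "check": the backend reads the flag and latches its decision. */
void
backend_check(Backend *b, const BuildState *s)
{
    assert(b != NULL && s != NULL);
    b->saw_ready = s->index_ready;
}

/* Step "act": returns true if the tuple was written to the index. */
bool
backend_insert(const Backend *b)
{
    assert(b != NULL);
    return b->saw_ready;
}

/*
 * Replays: check, then flag flip, then insert.
 * Returns true when the flag is already set but the insert still
 * skipped the index because it acted on the stale check.
 */
bool
replay_stale_check(void)
{
    BuildState s = { .index_ready = false };
    Backend b;
    bool written;

    backend_check(&b, &s);          /* 1. backend checks the flag */
    s.index_ready = true;           /* 2. builder flips it "1 ns" later */
    written = backend_insert(&b);   /* 3. backend acts on the old value */

    return s.index_ready && !written;
}
```

Making the flag atomic does not help here, because the window is
between the check and the insert, not inside the read itself.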
In all current maintenance operations we ensure in some way (by
locking/unlocking a relation or waiting for transactions) that
everyone has a fresh enough relcache. I don't think we should
introduce anything special for the CIC scenario here.
But some universal solution (like ensuring that every other
transaction that had an outdated relcache has ended) may benefit all
related scenarios.
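One way to picture such a universal check is a generation counter: each
running transaction records which generation of the metadata it has
seen, and the waiter proceeds once no active transaction is behind. The
sketch below is purely illustrative. TxnSlot and
count_stale_transactions are made-up names, and this is not how
PostgreSQL tracks relcache freshness today.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-transaction slot; not a PostgreSQL structure. */
typedef struct TxnSlot
{
    bool     active;            /* is a transaction running in this slot? */
    uint64_t seen_generation;   /* newest metadata generation it has seen */
} TxnSlot;

/*
 * Counts active transactions that have not yet seen required_generation.
 * A waiter may proceed only when this returns 0.
 */
size_t
count_stale_transactions(const TxnSlot *slots, size_t nslots,
                         uint64_t required_generation)
{
    size_t stale = 0;

    assert(slots != NULL || nslots == 0);

    for (size_t i = 0; i < nslots; i++)
    {
        if (slots[i].active &&
            slots[i].seen_generation < required_generation)
            stale++;
    }
    return stale;
}
```

A waiter would re-run the count until it reaches zero. Inactive slots
are ignored, since a finished transaction can no longer act on an
outdated view.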
> Improved STIR approach
>
> Here's another proposal using the STIR approach. It's a little different
> from the patches so far:
> ....
> 7. Retail insert all the tuples from the STIR to the index.
Hm, that's a clever idea...
At the same time, my tests show that the index scan is cheap compared
to the heap scans (especially the second one, which is not parallelized).
> Snapshot refreshing
> -------------------
> - In step 4, while we are building the index, we can periodically get a
> new snapshot, update the cutoff in the shmem struct, and drain the STIR
> of the tuples that are already in it.
Together with snapshot resetting, such an approach is still more
effective (in terms of the index scan), but it feels much more complex
and involves some complex locking.
I need to think about this a bit more.
Best regards,
Mikhail.