Proposal for Improving Concurrent Index Creation Performance

From: Sergey Sargsyan <sergey(dot)sargsyan(dot)2001(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Proposal for Improving Concurrent Index Creation Performance
Date: 2025-06-12 03:45:57
Message-ID: CAMAof6_FY0MrNJOuBrqvQqJKiwskFvjRtgpVHf-D7A=KvTtYXg@mail.gmail.com
Lists: pgsql-hackers

Hi PostgreSQL Hackers,

I've been exploring the process of concurrent index creation and noticed a
potential area for performance improvement, especially for large indexes.
Currently, the process involves multiple stages: creating the index
(initially invalid and not ready), building the index, validating the index
by checking that all tuples are included, and finally swapping the old
index with the new one.

The validation stage, where we sort the index entries by TID and compare
them to the heap, is currently single-threaded and can become a bottleneck
for large indexes.
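
A toy model of that validation pass, as a minimal sketch: this is plain
Python, not PostgreSQL code, and the names (Tid, find_missing_tids) are
hypothetical. It assumes the heap is scanned in ascending TID order and
ignores visibility checks entirely. It sorts the index TIDs once, then walks
the heap and the sorted list together to find heap tuples the index lacks.

```python
from typing import Iterable, List, Tuple

# (block number, offset): a simplified stand-in for a heap tuple identifier
Tid = Tuple[int, int]


def find_missing_tids(index_tids: Iterable[Tid],
                      heap_tids: Iterable[Tid]) -> List[Tid]:
    """Return heap TIDs that have no matching entry in the index.

    heap_tids must arrive in ascending TID order (a sequential heap scan).
    """
    sorted_index = sorted(index_tids)  # the single sort step
    missing: List[Tid] = []
    i = 0
    for tid in heap_tids:
        # Advance the index cursor past entries smaller than this heap TID.
        while i < len(sorted_index) and sorted_index[i] < tid:
            i += 1
        if i == len(sorted_index) or sorted_index[i] != tid:
            missing.append(tid)  # this tuple would have to be inserted
    return missing
```

Both the sort and the merge-style walk run on one worker in this model, which
is the part that grows with index size.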

To address this, I propose a modification: during the creation of the
index, we can create it as an empty but ready index. This means that while
the index is being built, new transactions will start adding tuples to it
immediately. So during the build stage we can do everything as before, but
instead of building the index from scratch from tuples, we will just merge new
tuples into the already-built index.

A simpler approach could be to do everything as before, but create one
"temporary" index (built empty, ready) alongside the index itself.
Then during the build stage we build our index as before, and instead of the
old validation stage, we just iterate over the "temp" index and move all of
its tuples into the main index.
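
To make the idea concrete, here is a toy model in plain Python (not
PostgreSQL code; build_from_snapshot and merge_temp_index are hypothetical
names). It assumes an index is just a sorted list of (key, TID) entries and
ignores deletes, updates, and visibility rules. The main index is built from
a snapshot of the heap, concurrent inserts land in the temp index, and the
final step merges the temp entries into the main index, skipping entries the
build already picked up.

```python
from typing import Dict, List, Tuple

Tid = Tuple[int, int]    # (block number, offset), simplified heap TID
Entry = Tuple[int, Tid]  # (index key, TID)


def build_from_snapshot(snapshot: Dict[Tid, int]) -> List[Entry]:
    """Build the main index from the tuples visible when the build started."""
    return sorted((key, tid) for tid, key in snapshot.items())


def merge_temp_index(main: List[Entry], temp: List[Entry]) -> List[Entry]:
    """Move temp-index entries into the main index.

    A tuple inserted just before the snapshot may appear in both indexes,
    so duplicates are dropped via the set union.
    """
    return sorted(set(main) | set(temp))
```

A short usage example under the same assumptions:

```python
snapshot = {(0, 1): 10, (0, 2): 30}      # heap contents at build start
temp = [(20, (0, 3)), (30, (0, 2))]      # entries added by concurrent writers
main = build_from_snapshot(snapshot)
merged = merge_temp_index(main, temp)
```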

I am mostly interested in this improvement for btree indexes, but I guess
it should work for all index types.

I'm curious if there are any potential pitfalls or reasons this approach
might not work as expected. I'd appreciate any feedback or insights from
the community on this idea.

Thank you!

Best regards,
Sergey Sargsian
