Proposal: Creating multiple indexes on a table using a single full table scan

From: Ильдар <igaraev77(at)yandex(dot)com>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Proposal: Creating multiple indexes on a table using a single full table scan
Date: 2025-09-10 07:31:45
Message-ID: 1459211757489461@mail.yandex.uz
Lists: pgsql-hackers

Hello hackers,

I’d like to propose an optimization for index creation on large tables.

Currently, when creating multiple indexes on the same table, each index is built independently, triggering a full table scan per index. This leads to significant redundant I/O, especially for very large tables.

For example, suppose we have a 1 TB table and we need to create 10 indexes. If each index takes 1 hour to build (mostly due to the time spent scanning the table), the total time ends up being around 10 hours. However, the data scan part is largely repeated work.

I believe this process could be optimized by introducing a mechanism that builds multiple indexes using a single shared full table scan. This way, the table is read once, and the relevant index data is routed to multiple build pipelines concurrently.

If implemented, this could potentially cut the total index build time in half or better, depending on system resources and the number of indexes.

I’m curious:
- Has this been discussed before?
- Are there any technical reasons why this wouldn’t be feasible?
- Would such a patch be of interest to the community?

Thanks,
Ildar Garaev
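
To illustrate the single-scan idea described above, here is a minimal Python sketch. It is a toy model, not PostgreSQL code: the table is an in-memory list, an "index" is just a sorted list of (key, row position) pairs, and the names `CountingTable`, `build_separately`, and `build_shared` are hypothetical, made up for this example. It only shows that both strategies produce the same indexes while the shared variant reads the table once; it does not model sorting cost, memory limits, or concurrency.

```python
from typing import Dict, Iterator, List, Tuple

Row = Dict[str, int]
Index = List[Tuple[int, int]]  # sorted (key, row position) pairs


class CountingTable:
    """Toy stand-in for a heap table that counts full scans (hypothetical)."""

    def __init__(self, rows: List[Row]) -> None:
        self.rows = list(rows)
        self.scans = 0

    def scan(self) -> Iterator[Tuple[int, Row]]:
        # Every call represents one full pass over the table.
        self.scans += 1
        for position, row in enumerate(self.rows):
            yield position, row


def build_separately(table: CountingTable, columns: List[str]) -> Dict[str, Index]:
    """Current behaviour as described above: one full scan per index."""
    indexes: Dict[str, Index] = {}
    for col in columns:
        entries = [(row[col], position) for position, row in table.scan()]
        indexes[col] = sorted(entries)
    return indexes


def build_shared(table: CountingTable, columns: List[str]) -> Dict[str, Index]:
    """Proposed behaviour: one scan, each row routed to every index build."""
    pending: Dict[str, Index] = {col: [] for col in columns}
    for position, row in table.scan():
        for col in columns:
            pending[col].append((row[col], position))
    return {col: sorted(entries) for col, entries in pending.items()}


rows = [
    {"a": 3, "b": 10, "c": 7},
    {"a": 1, "b": 30, "c": 5},
    {"a": 2, "b": 20, "c": 9},
]
columns = ["a", "b", "c"]

table_separate = CountingTable(rows)
separate = build_separately(table_separate, columns)

table_shared = CountingTable(rows)
shared = build_shared(table_shared, columns)

print(table_separate.scans, table_shared.scans)
print(separate == shared)
```

In this toy model the number of scans drops from one per index to one in total. How much wall-clock time that would save in a real build depends on how much of each build is scan time rather than per-index work, which the sketch leaves out.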

