Hello hackers,

I'd like to propose an optimization for index creation on large tables.

Currently, when creating multiple indexes on the same table, each index is built independently, triggering a full table scan per index. This leads to significant redundant I/O, especially on very large tables.

For example, suppose we have a 1 TB table and need to create 10 indexes. If each index takes 1 hour to build (mostly time spent scanning the table), the total comes to roughly 10 hours, even though most of that time is spent re-reading the same data.

I believe this could be optimized by introducing a mechanism that builds multiple indexes from *a single shared full table scan*: the table is read once, and each tuple is routed to multiple index build pipelines concurrently. (See the rough sketch below my signature.)

If implemented, this could cut the total index build time substantially, potentially in half or better, depending on system resources and the number of indexes being built.

I'm curious:
- Has this been discussed before?
- Are there any technical reasons why this wouldn't be feasible?
- Would such a patch be of interest to the community?

Thanks,
Ildar Garaev
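
P.S. A rough, purely illustrative sketch of the idea in C. This is not PostgreSQL code; all names here (Row, IndexBuilder, build_indexes_shared_scan, and so on) are hypothetical. A real implementation would feed each builder's sort/build pipeline rather than collect keys in arrays; the point is simply that one pass over the table serves every index.

/*
 * Illustrative sketch only: a single "table scan" feeding several
 * index-build callbacks, instead of one scan per index.
 */
#include <stdio.h>
#include <stdlib.h>

typedef struct Row { int id; int col_a; int col_b; } Row;

/* Per-index state: how to extract the indexed key, and the keys seen so far. */
typedef struct IndexBuilder {
    const char *name;
    int       (*key_of)(const Row *);   /* extracts this index's key from a row */
    int        *keys;
    size_t      nkeys;
} IndexBuilder;

static int key_col_a(const Row *r) { return r->col_a; }
static int key_col_b(const Row *r) { return r->col_b; }

/*
 * Scan the table once and hand every row to every builder.  In a real
 * implementation each builder would push into its own sort/build pipeline;
 * here we just collect keys to keep the sketch short.
 */
static void build_indexes_shared_scan(const Row *table, size_t nrows,
                                      IndexBuilder *builders, size_t nbuilders)
{
    for (size_t i = 0; i < nrows; i++)           /* single pass over the heap */
        for (size_t b = 0; b < nbuilders; b++)
            builders[b].keys[builders[b].nkeys++] = builders[b].key_of(&table[i]);
}

int main(void)
{
    Row table[] = { {1, 10, 7}, {2, 20, 3}, {3, 30, 9} };
    size_t nrows = sizeof(table) / sizeof(table[0]);

    IndexBuilder builders[] = {
        { "idx_col_a", key_col_a, malloc(nrows * sizeof(int)), 0 },
        { "idx_col_b", key_col_b, malloc(nrows * sizeof(int)), 0 },
    };

    build_indexes_shared_scan(table, nrows, builders, 2);

    for (size_t b = 0; b < 2; b++) {
        printf("%s:", builders[b].name);
        for (size_t k = 0; k < builders[b].nkeys; k++)
            printf(" %d", builders[b].keys[k]);
        printf("\n");
        free(builders[b].keys);
    }
    return 0;
}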