Re: Suggestion for concurrent index creation using a single full scan operation

From: Greg Stark <stark(at)mit(dot)edu>
To: Tim Kane <tim(dot)kane(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Suggestion for concurrent index creation using a single full scan operation
Date: 2013-07-23 13:17:13
Message-ID: CAM-w4HPb9J6f8i4SDac3ChF1dKrD3Pq6tPtEd_2hu_fbRJ3GZg@mail.gmail.com
Lists: pgsql-hackers

We already do this in pg_restore by starting multiple worker processes.
Those processes should get the benefit of synchronised sequential scans.

The way the API for indexes works, it wouldn't really be hard to start
multiple parallel index builds. I'm not sure how well the pg_restore thing
works, and sometimes the goal isn't to maximise speed, so starting
multiple processes isn't always ideal. It might sometimes be interesting
to be able to do it explicitly in a single process.
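
To make the single-process idea concrete, here is a toy sketch in Python. It is
not PostgreSQL code and does not reflect the actual index build API; the
function name build_indexes_single_scan and the row/column layout are made up
for illustration. It only shows the shape of the idea: one pass over the table
feeds every index build, and each build then sorts its own entries.

```python
def build_indexes_single_scan(rows, key_columns):
    """Toy model (hypothetical, not PostgreSQL internals).

    Scans `rows` once and returns, for each column in `key_columns`,
    a sorted list of (key, row_id) pairs standing in for an index.
    """
    # One accumulator per index being built.
    entries = {name: [] for name in key_columns}

    # The single sequential scan: each row is read once and
    # handed to every index build.
    for row_id, row in enumerate(rows):
        for name in key_columns:
            entries[name].append((row[name], row_id))

    # Each "index" is then sorted independently.
    return {name: sorted(pairs) for name, pairs in entries.items()}


table = [
    {"a": 3, "b": "x"},
    {"a": 1, "b": "z"},
    {"a": 2, "b": "y"},
]
indexes = build_indexes_single_scan(table, ["a", "b"])
```

Note that the sketch holds the sort input for every index at the same time,
which is the memory overhead raised in the original suggestion.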

--
greg
On Jul 23, 2013 1:06 PM, "Tim Kane" <tim(dot)kane(at)gmail(dot)com> wrote:

> Hi all,
>
> I haven't given this a lot of thought, but it struck me that when
> rebuilding tables (be it for a restore process, or some other operational
> activity) - there is more often than not a need to build an index or two,
> sometimes many indexes, against the same relation.
>
> It strikes me that in order to build just one index, we probably need to
> perform a full table scan (in a lot of cases). If we are building
> multiple indexes sequentially against that same table, then we're probably
> performing multiple sequential scans in succession, once for each index.
>
> Could we architect a mechanism that allowed multiple index creation
> statements to execute concurrently, with all of their inputs fed directly
> from a single sequential scan against the full relation?
>
> From a language construct point of view, this may not be trivial to
> implement for raw/interactive SQL - but possibly this is a candidate for
> the custom format restore?
>
> I presume this would substantially increase the memory overhead required
> to build those indexes, though the performance gains may be advantageous.
>
> Feel free to shoot holes through this :)
>
> Apologies in advance if this is not the correct forum for suggestions..
>
>
