Re: Multiple index builds on same table - in one sweep?

From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Multiple index builds on same table - in one sweep?
Date: 2011-04-11 02:29:49
Message-ID: 4DA2679D.5000207@2ndQuadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 04/09/2011 01:23 PM, Chris Ruprecht wrote:
> Maybe, in a future release, somebody will develop something that can create indexes as inactive and have a build tool build and activate them at the same time. Food for thought?
>

Well, the most common case where this sort of thing happens is when
people are using pg_restore to load a dump of an entire database. In
that case, you can use "-j" to run more than one loader job in parallel,
which can easily end up doing a bunch of index builds at once,
particularly at the end. That already works about as well as it can
because of the synchronized scan feature Tom mentioned.

I doubt you'll ever get much traction arguing for something other than
continuing to accelerate that path; correspondingly, making your own
index builds look as much like it as possible is a good practice. Fire
up as many builds as you can stand in parallel and see how many you can
take given the indexes+data involved. It's not clear to me how a create
as inactive strategy could improve on that.

There are some types of index build operations that bottleneck on CPU
operations, and executing several of those in parallel can be a win. At
some point you run out of physical I/O, or the additional memory you're
using starts taking away too much from caching. Once you're at that
point, it's better to build the indexes on another pass, even if it
requires re-scanning the table data to do it. The tipping point varies
based on both system and workload, it's very hard to predict or automate.

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Greg Smith 2011-04-11 02:40:48 Re: optimizer parameters
Previous Message Tomas Vondra 2011-04-10 23:27:58 Re: optimizer parameters