Re: Using multiple cores for index creation?

From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: henk de wit <henk53602(at)hotmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Using multiple cores for index creation?
Date: 2009-01-29 23:19:42
Message-ID: 1233271182.18633.6.camel@jd-laptop.pragmaticzealot.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Thu, 2009-01-29 at 18:09 -0500, Robert Haas wrote:
> On Thu, Jan 29, 2009 at 3:21 PM, henk de wit <henk53602(at)hotmail(dot)com> wrote:
> > Hi,
> > When I try to restore a database dump on PostgreSQL 8.3
> > that's approximately 130GB in size and takes about 1 hour, I noticed index
> > creation makes up the bulk of that time. I'm using a very fast I/O subsystem
> > (16 Mtron Pro 7535 SSDs using a dual 1.2Ghz IOP/4GB cache RAID controller),
> > fast CPUs (2 quad core C2Q's at 2.6Ghz) and 32GB RAM. From monitoring the
> > restore process, I learned that only 10 minutes is spend doing IO, while the
> > rest of the time is spend on creating the indexes. Index creation seems to
> > be completely CPU bound.
> > The problem is that only 1 CPU core is used. My other 7 cores are just
> > sitting there doing nothing. It seems to me that creating each index,
> > especially for different tables, is something that can be done
> > independently.
> > Is there some way I can let PostgreSQL use multiple cores for creating the
> > indexes?
>
> Andrew Dunstan has been working on this problem. His latest parallel
> restore patch can be found here:
>
> http://archives.postgresql.org/message-id/4977E070.6070604@dunslane.net

Yeah but that isn't useful for 8.3. What can be done in this specific
situation is to make sure you dump with the -Fc option. You can then
pull a TOC out with pg_restore and break that appart. Reading the TOC is
pretty self evident. Once you get down to index creation you can create
multiple files each with a group of indexes to create. Then call
pg_restore multiple times in a script against the individual TOC and you
will use all cores.

Joshua D. Drake

P.S. Increase maintenance_work_mem can help too

>
> ...Robert
>
--
PostgreSQL - XMPP: jdrake(at)jabber(dot)postgresql(dot)org
Consulting, Development, Support, Training
503-667-4564 - http://www.commandprompt.com/
The PostgreSQL Company, serving since 1997

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Robert Haas 2009-01-29 23:21:17 Re: Sort performance
Previous Message Robert Haas 2009-01-29 23:09:53 Re: Using multiple cores for index creation?