Skip site navigation (1) Skip section navigation (2)

Re: index creation order?

From: Allen Landsidel <all(at)biosys(dot)net>
To: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: index creation order?
Date: 2003-10-31 18:27:12
Message-ID: 6.0.0.22.0.20031031131506.024864b0@pop.hotpop.com (view raw or flat)
Thread:
Lists: pgsql-performance
At 12:10 10/31/2003, Josh Berkus wrote:
>Allen,
>
> > a) CREATE TABLE with no indexes or keys.  Run the COPY (fast, ~30min), then
> > CREATE INDEX on each column it's needed on, and ALTER TABLE for the pk and
> > each fk needed.
>
>Did you ANALYZE after the copy?

No, and this was my major mistake.  I normally run analyze periodically 
from cron, anywhere from once an hour to ever 15 minutes depending on the 
db.. I had disabled that for this because I didn't want anything competing 
with this stuff for disk I/O.

I followed your other suggestions as well, canceled the index that was 
running, analyzed the whole db, and ran the queries again.  All of them are 
running in under 10 or so minutes after the analyze.

I'll just be adding the PKs and the Indexes, I can add triggers/rules of my 
own for the RI, rather than worry about FK creation screwing up.

I had no idea analyze was playing such a big role in this sense.. I really 
thought that other than saving space, it wasn't doing much for tables that 
don't have indexes on the.

Thanks for the help.

> > shared_buffers = 30000
>hmmm ... 236MB ....
> > max_fsm_pages = 2000000
>2MB, fine ...
> > wal_buffers = 128
>1MB, also fine ...
> > sort_mem = 1310720 (1.2GB)
>Problem here.   As documented everywhere, sort_mem is allocated *per sort* 
>not
>per query, user, or shared.   This means that if the "add PK" operation
>involves 2 or more sorts (not sure, haven't tested it), then you're
>allocating .7GB RAM more than you acutally have.  This may be the cause of
>your problem, particularly if *anything* is going on concurrent to the load.

I didn't know this was per-sort per-backend, I thought it was per-backend 
for all sorts running on that backend.  I've dropped it down to 256MB.

> > checkpoint_segments = 64
>IF you have the disk space (+ 2GB) I'd raise this to 150-300 during the load
>operation.

Done, at 128, which seems to be enough for now.  I'll fiddle more with this 
later on.

> > commit_delay = 20000
> > commit_siblings = 2
>These settings are for heavy multi-user update activity. They are not useful
>for a single-user load, and may even lower performance.

That's what's going on.. this database I'm working on isn't the only one in 
the system, and some things are using different schemas in the database I'm 
working on, so this isn't something I can afford to turn off.  Most of the 
activity is heavy and transient.. many INSERT/UPDATE/DELETE cycles.

Again, thanks for the help, I really do appreciate it.  It's gratifying and 
depressing to know the last two or so days work could've been compressed 
into 3 hours if I'd just run that damn analyze. ;)




In response to

Responses

pgsql-performance by date

Next:From: Allen LandsidelDate: 2003-10-31 18:28:44
Subject: Re: index creation order?
Previous:From: William YuDate: 2003-10-31 17:17:26
Subject: Re: Pg+Linux swap use

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group