Skip site navigation (1) Skip section navigation (2)

Re: index creation order?

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Allen Landsidel <all(at)biosys(dot)net>,pgsql-performance(at)postgresql(dot)org
Subject: Re: index creation order?
Date: 2003-10-31 17:10:30
Message-ID: (view raw, whole thread or download thread mbox)
Lists: pgsql-performance

> a) CREATE TABLE with no indexes or keys.  Run the COPY (fast, ~30min), then
> CREATE INDEX on each column it's needed on, and ALTER TABLE for the pk and
> each fk needed.

Did you ANALYZE after the copy?

> If there isn't a significant difference between all of them, performance
> wise, I think something is dreadfully wrong here.  Running "a", the ALTER
> TABLE to add the PK ran for 17 hours and still wasn't finished.

Adding the *primary key* locked up?   This seems unlikely; we have a known 
problem with *foreign* keys until the current beta.  But I've added primary 
keys on 20Gb tables and had it complete in a couple of hours.  Ignore this 
adivice and look for Stephan Szabo's FK patch instead if what you really 
meant was that the FK creation locked up.

> shared_buffers = 30000
hmmm ... 236MB ....
> max_fsm_pages = 2000000
2MB, fine ...
> wal_buffers = 128
1MB, also fine ...
> sort_mem = 1310720 (1.2GB)
Problem here.   As documented everywhere, sort_mem is allocated *per sort* not 
per query, user, or shared.   This means that if the "add PK" operation 
involves 2 or more sorts (not sure, haven't tested it), then you're 
allocating .7GB RAM more than you acutally have.  This may be the cause of 
your problem, particularly if *anything* is going on concurrent to the load.

> checkpoint_segments = 64
IF you have the disk space (+ 2GB) I'd raise this to 150-300 during the load 

> commit_delay = 20000
> commit_siblings = 2
These settings are for heavy multi-user update activity. They are not useful 
for a single-user load, and may even lower performance.

> stats_start_collector = true
> stats_command_string = true
> stats_row_level = true
> stats_block_level = true

If you can do without stats collection during load, I would suggest that you 
do so.  The above add both RAM and I/O overhead to your operation.

Josh Berkus
Aglio Database Solutions
San Francisco

In response to


pgsql-performance by date

Next:From: William YuDate: 2003-10-31 17:17:26
Subject: Re: Pg+Linux swap use
Previous:From: Andrew SullivanDate: 2003-10-31 17:06:37
Subject: Re: Pg+Linux swap use

Privacy Policy | About PostgreSQL
Copyright © 1996-2017 The PostgreSQL Global Development Group