Re: index creation order?

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Allen Landsidel <all(at)biosys(dot)net>, pgsql-performance(at)postgresql(dot)org
Subject: Re: index creation order?
Date: 2003-10-31 17:10:30
Message-ID: 200310310910.30504.josh@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Allen,

> a) CREATE TABLE with no indexes or keys. Run the COPY (fast, ~30min), then
> CREATE INDEX on each column it's needed on, and ALTER TABLE for the pk and
> each fk needed.

Did you ANALYZE after the copy?

> If there isn't a significant difference between all of them, performance
> wise, I think something is dreadfully wrong here. Running "a", the ALTER
> TABLE to add the PK ran for 17 hours and still wasn't finished.

Adding the *primary key* locked up? This seems unlikely; we have a known
problem with *foreign* keys until the current beta. But I've added primary
keys on 20Gb tables and had it complete in a couple of hours. Ignore this
adivice and look for Stephan Szabo's FK patch instead if what you really
meant was that the FK creation locked up.

> shared_buffers = 30000
hmmm ... 236MB ....
> max_fsm_pages = 2000000
2MB, fine ...
> wal_buffers = 128
1MB, also fine ...
> sort_mem = 1310720 (1.2GB)
Problem here. As documented everywhere, sort_mem is allocated *per sort* not
per query, user, or shared. This means that if the "add PK" operation
involves 2 or more sorts (not sure, haven't tested it), then you're
allocating .7GB RAM more than you acutally have. This may be the cause of
your problem, particularly if *anything* is going on concurrent to the load.

> checkpoint_segments = 64
IF you have the disk space (+ 2GB) I'd raise this to 150-300 during the load
operation.

> commit_delay = 20000
> commit_siblings = 2
These settings are for heavy multi-user update activity. They are not useful
for a single-user load, and may even lower performance.

> stats_start_collector = true
> stats_command_string = true
> stats_row_level = true
> stats_block_level = true

If you can do without stats collection during load, I would suggest that you
do so. The above add both RAM and I/O overhead to your operation.

--
Josh Berkus
Aglio Database Solutions
San Francisco

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message William Yu 2003-10-31 17:17:26 Re: Pg+Linux swap use
Previous Message Andrew Sullivan 2003-10-31 17:06:37 Re: Pg+Linux swap use