Re: index creation order?

From: Allen Landsidel <all(at)biosys(dot)net>
To: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: index creation order?
Date: 2003-10-31 18:27:12
Message-ID: 6.0.0.22.0.20031031131506.024864b0@pop.hotpop.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

At 12:10 10/31/2003, Josh Berkus wrote:
>Allen,
>
> > a) CREATE TABLE with no indexes or keys. Run the COPY (fast, ~30min), then
> > CREATE INDEX on each column it's needed on, and ALTER TABLE for the pk and
> > each fk needed.
>
>Did you ANALYZE after the copy?

No, and this was my major mistake. I normally run analyze periodically
from cron, anywhere from once an hour to ever 15 minutes depending on the
db.. I had disabled that for this because I didn't want anything competing
with this stuff for disk I/O.

I followed your other suggestions as well, canceled the index that was
running, analyzed the whole db, and ran the queries again. All of them are
running in under 10 or so minutes after the analyze.

I'll just be adding the PKs and the Indexes, I can add triggers/rules of my
own for the RI, rather than worry about FK creation screwing up.

I had no idea analyze was playing such a big role in this sense.. I really
thought that other than saving space, it wasn't doing much for tables that
don't have indexes on the.

Thanks for the help.

> > shared_buffers = 30000
>hmmm ... 236MB ....
> > max_fsm_pages = 2000000
>2MB, fine ...
> > wal_buffers = 128
>1MB, also fine ...
> > sort_mem = 1310720 (1.2GB)
>Problem here. As documented everywhere, sort_mem is allocated *per sort*
>not
>per query, user, or shared. This means that if the "add PK" operation
>involves 2 or more sorts (not sure, haven't tested it), then you're
>allocating .7GB RAM more than you acutally have. This may be the cause of
>your problem, particularly if *anything* is going on concurrent to the load.

I didn't know this was per-sort per-backend, I thought it was per-backend
for all sorts running on that backend. I've dropped it down to 256MB.

> > checkpoint_segments = 64
>IF you have the disk space (+ 2GB) I'd raise this to 150-300 during the load
>operation.

Done, at 128, which seems to be enough for now. I'll fiddle more with this
later on.

> > commit_delay = 20000
> > commit_siblings = 2
>These settings are for heavy multi-user update activity. They are not useful
>for a single-user load, and may even lower performance.

That's what's going on.. this database I'm working on isn't the only one in
the system, and some things are using different schemas in the database I'm
working on, so this isn't something I can afford to turn off. Most of the
activity is heavy and transient.. many INSERT/UPDATE/DELETE cycles.

Again, thanks for the help, I really do appreciate it. It's gratifying and
depressing to know the last two or so days work could've been compressed
into 3 hours if I'd just run that damn analyze. ;)

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Allen Landsidel 2003-10-31 18:28:44 Re: index creation order?
Previous Message William Yu 2003-10-31 17:17:26 Re: Pg+Linux swap use