Re: Configuration Advice

From:	Chris Browne <cbbrowne(at)acm(dot)org>
To:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: Configuration Advice
Date:	2007-01-17 23:08:44
Message-ID:	60irf5metf.fsf@dba2.int.libertyrms.com
Lists:	pgsql-performance

cheetah(at)tanabi(dot)org (Steve) writes:
> I'm wondering what we can do to make
> this better if anything; would it be better to leave the indexes on?
> It doesn't seem to be.

Definitely NOT. Generating an index via a bulk sort is a LOT faster
than loading data into an index one tuple at a time.

We saw a BIG increase in performance of Slony-I when, in version
1.1.5, we added a modification that shuts off indexes during COPY and
then does a reindex. Conceivably, you might look at how Slony-I does
that, and try doing the same thing; it might well be faster than doing
a bunch of reindexes serially. (Or not...)
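A minimal sketch of the general "drop the index, bulk load, rebuild once" pattern follows. It uses Python's sqlite3 module as a stand-in so that it runs anywhere; it is not how Slony-I itself does it. The table `items`, the index `items_val_idx`, and the helper `bulk_load` are hypothetical names. On PostgreSQL the load step would be a COPY and the rebuild a CREATE INDEX or REINDEX.

```python
import sqlite3

def bulk_load(conn, rows):
    """Drop the index, load all rows, then rebuild the index in one pass."""
    cur = conn.cursor()
    # Without the index, each inserted row avoids an index update.
    cur.execute("DROP INDEX IF EXISTS items_val_idx")
    cur.executemany("INSERT INTO items (id, val) VALUES (?, ?)", rows)
    # Rebuild the index once, over the fully loaded table.
    cur.execute("CREATE INDEX items_val_idx ON items (val)")
    conn.commit()

# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, val TEXT)")
conn.execute("CREATE INDEX items_val_idx ON items (val)")

bulk_load(conn, [(i, f"v{i}") for i in range(1000)])

row_count = conn.execute("SELECT count(*) FROM items").fetchone()[0]
index_names = [r[1] for r in conn.execute("PRAGMA index_list('items')")]
```

Whether this beats loading with the indexes in place depends on table size and the number of indexes, so it is worth timing both on real data.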

> Would it be better to use INSERTs instead of copies? Doesn't seem
> to be.

I'd be mighty surprised.

> - The load process itself takes about 6 gigs of memory, the rest is
> free for postgres because this is basically all the machine does.

The settings you have do not seem conspicuously wrong in any way.

The one thought which comes to mind is that if you could turn this
into a *mostly* incremental change, that might help.

The thought:

- Load the big chunk of data into a new table

- Generate some minimal set of indices on the new table

- Generate four queries that compare old to new:
    q1 - See which tuples are unchanged from yesterday to today
    q2 - See which tuples have been deleted from yesterday to today
    q3 - See which tuples have been added
    q4 - See which tuples have been modified
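The four comparison queries might look roughly like the sketch below. It assumes both tables share a primary key `id` and a single non-NULL `payload` column; the names `old_data` and `new_data` are hypothetical, and sqlite3 is used only so the example is self-contained. The same joins should carry over to PostgreSQL with little change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE old_data (id INTEGER PRIMARY KEY, payload TEXT NOT NULL);
    CREATE TABLE new_data (id INTEGER PRIMARY KEY, payload TEXT NOT NULL);
    INSERT INTO old_data VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO new_data VALUES (1, 'a'), (3, 'C'), (4, 'd');
""")

QUERIES = {
    # q1: key present in both tables, payload identical
    "q1_unchanged": """
        SELECT o.id FROM old_data o
        JOIN new_data n ON n.id = o.id
        WHERE n.payload = o.payload ORDER BY o.id""",
    # q2: key present yesterday, missing today
    "q2_deleted": """
        SELECT o.id FROM old_data o
        LEFT JOIN new_data n ON n.id = o.id
        WHERE n.id IS NULL ORDER BY o.id""",
    # q3: key present today, missing yesterday
    "q3_added": """
        SELECT n.id FROM new_data n
        LEFT JOIN old_data o ON o.id = n.id
        WHERE o.id IS NULL ORDER BY n.id""",
    # q4: key present in both tables, payload differs
    "q4_modified": """
        SELECT o.id FROM old_data o
        JOIN new_data n ON n.id = o.id
        WHERE n.payload <> o.payload ORDER BY o.id""",
}

diff = {name: [row[0] for row in conn.execute(sql)]
        for name, sql in QUERIES.items()}
```

With nullable columns, the `=` and `<>` comparisons would need NULL-aware handling, which this sketch leaves out.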

If the "unchanged" set is extremely large, then you might see benefit
from updating the existing table in place: delete the rows indicated
by q2, insert rows based on q3, and update based on q4.

In principle, computing and applying those 4 queries might be quicker
than rebuilding from scratch.

In principle, applying q2, then q4, then vacuuming, then q3, ought to
be "optimal."
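
A sketch of that ordering, under the same assumptions as before (hypothetical tables `old_data` and `new_data` keyed on `id`, with sqlite3 standing in for PostgreSQL). The likely reasoning behind the order is that the deletes and updates leave dead space, and vacuuming before the inserts lets the new rows reuse it. The VACUUM has to run outside a transaction, hence the commit before it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE old_data (id INTEGER PRIMARY KEY, payload TEXT NOT NULL);
    CREATE TABLE new_data (id INTEGER PRIMARY KEY, payload TEXT NOT NULL);
    INSERT INTO old_data VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO new_data VALUES (1, 'a'), (3, 'C'), (4, 'd');
""")

# q2: delete rows that no longer exist in the new load.
conn.execute("""
    DELETE FROM old_data
    WHERE NOT EXISTS (SELECT 1 FROM new_data n WHERE n.id = old_data.id)""")

# q4: update rows whose payload changed.
conn.execute("""
    UPDATE old_data
    SET payload = (SELECT n.payload FROM new_data n WHERE n.id = old_data.id)
    WHERE EXISTS (SELECT 1 FROM new_data n
                  WHERE n.id = old_data.id AND n.payload <> old_data.payload)""")

# Commit first: VACUUM cannot run inside an open transaction.
conn.commit()
conn.execute("VACUUM")

# q3: insert rows that are new today.
conn.execute("""
    INSERT INTO old_data (id, payload)
    SELECT n.id, n.payload FROM new_data n
    WHERE NOT EXISTS (SELECT 1 FROM old_data o WHERE o.id = n.id)""")
conn.commit()

final_rows = conn.execute(
    "SELECT id, payload FROM old_data ORDER BY id").fetchall()
expected_rows = conn.execute(
    "SELECT id, payload FROM new_data ORDER BY id").fetchall()
```

After the three steps, `old_data` should hold the same rows as `new_data`, having touched only the rows that changed.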
--
let name="cbbrowne" and tld="cbbrowne.com" in name ^ "@" ^ tld;;
http://linuxdatabases.info/info/linux.html
"A 'Cape Cod Salsa' just isn't right." -- Unknown

In response to

Configuration Advice at 2007-01-17 20:41:45 from Steve

Responses

Re: Configuration Advice at 2007-01-18 00:50:11 from Steve
