Skip site navigation (1) Skip section navigation (2)

Re: Performance on inserts

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jules Bean <jules(at)jellybean(dot)co(dot)uk>
Cc: Alfred Perlstein <bright(at)wintelcom(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Performance on inserts
Date: 2000-08-26 16:32:20
Message-ID: 3180.967307540@sss.pgh.pa.us (view raw or flat)
Thread:
Lists: pgsql-hackers
Jules Bean <jules(at)jellybean(dot)co(dot)uk> writes:
> Is there any chance you could generate a patch against released 7.0.2
> to add just this functionality... It would be the kiss of life for my
> code!

Will look at it.  Are you brave enough to want to try the rest of the
7.1 rewrite of the btree code, or do you just want this one hack?

> And, of course, what would /really/ get my code going speedily would
> be the partial indices mentioned elsewhere in this thread.  If the
> backend could automagically drop keys containing > 10% (tunable) of
> the rows from the index, then my index would be (a) about 70% smaller!

I don't think anyone was envisioning "automagic" drop of most common
values.  The partial-index support that's partially there ;-) is
designed around manual specification of a predicate, ie, you'd say

	CREATE INDEX mypartialindex ON table (column)
		WHERE column != 42 AND column != 1066

if you wanted a partial index omitting values 42 and 1066.  The backend
would then consider using the index to process queries wherein it can
prove that the query's WHERE implies the index predicate.  For example

	SELECT * FROM table WHERE column = 11

would be able to use this index but

	SELECT * FROM table WHERE column < 100

would not.

You could certainly write a little periodic-maintenance script to
determine the most common values in your tables and recreate your
partial indexes accordingly ... but I doubt it'd make sense to try
to get the system to do that automatically on-the-fly.

> For the short term, if I can get a working version of the above
> randomisation patch, I think I shall 'fake' a partial index by
> manually setting 'enable_seqscan=off' for all but the 4 or 5 most
> common categories. Those two factors combined will speed up my bulk
> inserts a lot.

Uh, enable_seqscan has nothing to do with how inserts are handled...

> Is there any simple way for Pg to combine inserts into one bulk?

COPY.

> Specifically, their effect on the index files.

This particular problem couldn't be cured by batching inserts anyway.
The performance problem was coming from the actual act of inserting
a key (or more specifically, making room for the key) and that's just
got to be done for each key AFAICS.

			regards, tom lane

In response to

Responses

pgsql-hackers by date

Next:From: Thomas LockhartDate: 2000-08-26 16:33:02
Subject: Re: Proposal for supporting outer joins in 7.1
Previous:From: ChrisDate: 2000-08-26 16:17:56
Subject: Re: AW: How Do You Pronounce "PostgreSQL"?

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group