Re: Performance on inserts

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Jules Bean <jules(at)jellybean(dot)co(dot)uk>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alfred Perlstein <bright(at)wintelcom(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Performance on inserts
Date: 2000-10-15 21:44:52
Message-ID: 200010152144.RAA15769@candle.pha.pa.us
Lists: pgsql-hackers

> > 98304 22.07 5545984
> > 196608 45.60 11141120
> > 393216 92.53 22290432
> >
> > I tried probabilities from 0.67 to 0.999 and found that runtimes didn't
> > vary a whole lot (though this is near the minimum), while index size
> > consistently got larger as the probability of moving right decreased.
> > The runtime is nicely linear throughout the range.
>
> That looks brilliant!! (Bearing in mind that I have over 10 million
> tuples in my table, you can imagine what performance was like for me!)
> Is there any chance you could generate a patch against released 7.0.2
> to add just this functionality... It would be the kiss of life for my
> code!
>
> (Not in a hurry, I'm not back in work until Wednesday, as it happens)
>
> And, of course, what would /really/ get my code going speedily would
> be the partial indices mentioned elsewhere in this thread. If the
> backend could automagically drop keys containing > 10% (tunable) of
> the rows from the index, then my index would be (a) about 70% smaller!
> and (b) only used when it's faster. [This means it would have to
> update some simple histogram data. However, I can't see that being
> much of an overhead]
>
> For the short term, if I can get a working version of the above
> randomisation patch, I think I shall 'fake' a partial index by
> manually setting 'enable_seqscan=off' for all but the 4 or 5 most
> common categories. Those two factors combined will speed up my bulk
> inserts a lot.

What would be really nifty is to take the most common value found by
VACUUM ANALYZE and have the optimizer choose a sequential scan for
lookups on that value whenever it represents more than 50% of the
entries in the table.

Added to TODO:

* Prevent index lookups (or index entries using partial index) on most
common values; instead use sequential scan
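
A rough sketch of that rule, for illustration only. This is not
optimizer code; the function names, the in-memory column, and the 0.5
default threshold are all assumptions made up for the example. It
computes the most common value and its share of the rows (roughly what
VACUUM ANALYZE would record), then picks a scan type for a given lookup
value.

```python
from collections import Counter


def most_common_fraction(values):
    """Return (value, fraction of rows) for the most frequent value.

    Hypothetical stand-in for the statistic VACUUM ANALYZE would gather.
    Returns (None, 0.0) for an empty column.
    """
    if not values:
        return None, 0.0
    value, count = Counter(values).most_common(1)[0]
    return value, count / len(values)


def choose_scan(lookup_value, mcv, mcv_fraction, threshold=0.5):
    """Pick a scan type for an equality lookup on lookup_value.

    Assumed rule: use a sequential scan only when the lookup targets the
    most common value and that value covers more than `threshold` of the
    rows; otherwise use the index.
    """
    if lookup_value == mcv and mcv_fraction > threshold:
        return "seqscan"
    return "indexscan"


# Toy column: "a" fills 7 of 10 rows, so lookups on "a" skip the index.
column = ["a"] * 7 + ["b"] * 2 + ["c"]
mcv, frac = most_common_fraction(column)
plan_common = choose_scan("a", mcv, frac)
plan_rare = choose_scan("b", mcv, frac)
```

With the threshold lowered to something like 0.1, the same check would
approximate the "drop keys containing > 10% of the rows" idea quoted
above, though only for the single most common value.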

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
