Re: Performance on inserts

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alfred Perlstein <bright(at)wintelcom(dot)net>
Cc: Jules Bean <jules(at)jellybean(dot)co(dot)uk>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Performance on inserts
Date: 2000-08-25 23:00:22
Message-ID: 9612.967244422@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Alfred Perlstein <bright(at)wintelcom(dot)net> writes:
> A slightly incorrect startscan point is better than starting
> from the beginning every time.

Not for lookups, it isn't ;-).

I did think about making the btree-descending code act differently
for inserts than lookups, but AFAICS that wouldn't buy anything for
this problem, since until you get down to the leaf level you can't
see where there's free space anyway. You'd simply be exchanging the
problem of missing free space that's too far to the right for the
problem of missing free space that's to the left of where you chose
to start looking.

The algorithm does seem to work quite nicely just the way I described
it, although it turns out I was off base about a good probability
setting. I find that something up around 0.99 seems to be good.
Using the same (perhaps overly simplistic) test case:

# tuples inserted 6.5 current+random hack @ 0.99
Time index size Time index size
1536 <1sec 90112 <1sec 106496
3072 1.56 163840 <1sec 188416
6144 3.70 286720 1.40 376832
12288 9.73 532480 2.65 688128
24576 93.26 1024000 5.22 1368064
49152 363.23 2007040 10.34 2727936
98304 22.07 5545984
196608 45.60 11141120
393216 92.53 22290432

I tried probabilities from 0.67 to 0.999 and found that runtimes didn't
vary a whole lot (though this is near the minimum), while index size
consistently got larger as the probability of moving right decreased.
The runtime is nicely linear throughout the range.

The index size increase might look like a bit of a jump, but actually
this is right where we want it to be. The old code was effectively
packing each page as full as it could be under these conditions. That's
not what you want for a btree. Steady-state load factor for btrees is
usually quoted as somewhere around 70%, and this method manages to
approximate that pretty well with a move-right probability of 0.99 or
a bit less.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Lockhart 2000-08-26 01:36:57 Re: Performance on inserts
Previous Message Vince Vielhaber 2000-08-25 22:50:02 Pure ODBMS (fwd)