Re: WIP: Fast GiST index build

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: Fast GiST index build
Date: 2011-08-11 20:30:06
Message-ID: CAPpHfdtjfAYSLAgstT3oTb7v4t8XZCfHwMm1O2d2KVuDnNuS3g@mail.gmail.com
Lists: pgsql-hackers

On Thu, Aug 11, 2011 at 2:28 PM, Heikki Linnakangas <
heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:

> On 10.08.2011 22:44, Alexander Korotkov wrote:
>
>> Manual and readme updates.
>>
>
> Thanks, I'm reviewing these now.
>
> Do we want to expose the level-step and buffersize parameters to users?
> They've been useful during testing, but I'm thinking we should be able to
> guess good enough values for them automatically, and just remove the
> options. It's pretty much impossible for a user to tune them correctly, it
> would require deep knowledge of the buffering algorithm.
>
> I'm thinking that even when you explicitly turn buffering on, we should
> still process the first 10000 or so tuples with simple inserts. That way we
> always have a sample of tuples to calculate the average tuple size from.
> It's plausible that if the input data is ordered, looking at the first N
> tuples will give skewed sample, but I don't think there's much danger of
> that in practice. Even if the data is ordered, the length of GiST tuples
> shouldn't vary much.
>
> What happens if we get the levelstep and pagesPerBuffer estimates wrong?
> How sensitive is the algorithm to that? Or will we run out of memory? Would
> it be feasible to adjust those in the middle of the index build, if we e.g
> exceed the estimated memory usage greatly?

I see the following risks.

For levelstep:
Too small: less I/O benefit than there could be.
Too large:
1) If a subtree doesn't fit in effective_cache_size, much more I/O than
there should be (because of cache misses during buffer emptying).
2) If the last pages of the buffers don't fit in maintenance_work_mem,
possible OOM.

For buffersize:
Too small: less I/O benefit, because the buffer size is small relative to
the sub-tree size.
Too large: greater CPU overhead (because of more penalty function calls)
than necessary for the same I/O benefit.

Thereby I propose the following.
1) A too-large levelstep is the greatest risk, so let's use a pessimistic
estimate for it. The pessimistic estimate has the following logic:
minimal tuple size => maximal tuples per page => largest possible sub-tree.
Thereby, by always using the minimal tuple size in the levelstep
calculation, we exclude the greatest risks.
2) The risks of buffersize are comparable and not too critical. That's why
I propose to use the size of the first 10000 tuples for the estimate.

------
With best regards,
Alexander Korotkov.
