Re: WIP: Fast GiST index build

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: Fast GiST index build
Date: 2011-08-30 20:39:23
Message-ID: CAPpHfdsKWB44vQhL99puFxfNwFODEGtoV9vhk5npxC4SYAf97Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Aug 30, 2011 at 9:29 PM, Heikki Linnakangas <
heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:

> On 30.08.2011 13:29, Alexander Korotkov wrote:
>
>> On Tue, Aug 30, 2011 at 1:13 PM, Heikki Linnakangas<
>> heikki(dot)linnakangas(at)**enterprisedb(dot)com<heikki(dot)linnakangas(at)enterprisedb(dot)com>>
>> wrote:
>>
>> So, over 50% of the CPU time is spent in choosing a block from the
>>> temporary files. That should be pretty easy to improve..
>>>
>>> Yes, probably we can just remove free blocks sorting.
>>
>
> Ok, the first results are in for that:
>
> testname | nrows | duration | accesses
> ---------------------------+--**---------+-----------------+--**--------
> points unordered buffered | 250000000 | 06:00:23.707579 | 4049832
>
> From the previous test runs, the unbuffered index build took under 4 hours,
> so even though this is a lot better than with the sorting, it's still a loss
> compared to non-buffered build.
>
> I had vmstat running during most of this index build. At a quick glance, it
> doesn't seem to be CPU bound anymore. I suspect the buffers in the temporary
> file gets very fragmented. Or, we're reading it in backwards order because
> the buffers work in a LIFO fashion. The system seems to be doing about 5
> MB/s of I/O during the build, which sounds like a figure you'd get for more
> or less random I/O.

So, we still have two questions:
1) Why buffering build algorithm doesn't show any benefit on these tests?
2) Why test results on your test setup differs from test results on my test
setup?

I can propose following answers now:
1) I think it's because high overlaps in the tree. As I mentioned before
high overlaps can cause only fraction of the tree to be used for actual
inserts. For comparison, with my split algorithm (which produce almost no
overlaps on uniform dataset) buffering index build took 4 hours, while
regular build is still running (already more than 8 days = 192 hours)!
2) Probably it's because different behavour of OS cache. For example, on my
test setup OS displace unused pages from cache too slowly. Thereby buffering
algorithm showed benefit nevertheless.

Also it seems to me that I start to understand problem of new linear
splitting algorithm. On dataset with 1M rows it produces almost no overlaps
while it produces significant overlaps already on 10M rows (drama!).
Probably nobody tested it on large enough datasets (neither while original
research or before commit). I'll dig it in more details and provide some
testing results.

------
With best regards,
Alexander Korotkov.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2011-08-30 20:44:11 Re: pg_upgrade automatic testing
Previous Message Tom Lane 2011-08-30 20:37:07 Re: spinlocks on HP-UX