Re: WIP: Fast GiST index build

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: Fast GiST index build
Date: 2011-09-01 09:23:51
Message-ID: CAPpHfdsUTusxjB26cHnbVWCFkxQc87juJsTkkoPUj=GrJzU5ag@mail.gmail.com
Lists: pgsql-hackers
On Thu, Sep 1, 2011 at 12:59 PM, Heikki Linnakangas <
heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:

> So I changed the test script to generate the table as:
>
> CREATE TABLE points AS SELECT random() as x, random() as y FROM
> generate_series(1, $NROWS);
>
> The unordered results are in:
>
>          testname           |   nrows   |    duration     | accesses
> -----------------------------+-----------+-----------------+----------
>  points unordered buffered   | 250000000 | 05:56:58.575789 |  2241050
>  points unordered auto       | 250000000 | 05:34:12.187479 |  2246420
>  points unordered unbuffered | 250000000 | 04:38:48.663952 |  2244228
>
> Although the buffered build doesn't lose as badly as it did with more
> overlap, it still doesn't look good :-(. Any ideas?


But that is still a lot of overlap. It's about 220 accesses per small-area
query, which is about 10 to 20 times more than there should be without overlap.
If we roughly assume that 10 times more overlap means only about 1/10 of the
tree is used for actual inserts, then that part of the tree can easily fit in
the cache.
You can try my splitting algorithm on your test setup (in this case I advise
starting with a smaller number of rows, 100 M for example).
I have asked Oleg for real-life datasets that cause trouble in practice.
Probably those datasets are even larger, or the new linear split produces
less overlap on them.
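
The rough estimate above can be written out as a small calculation. This is only a sketch of the arithmetic, not a measurement: the number of test queries (10,000) is an assumption that is not stated in this message, and the function names, index size and cache size are hypothetical placeholders.

```python
def accesses_per_query(total_accesses: int, num_queries: int) -> float:
    """Average number of accesses needed to answer one query."""
    return total_accesses / num_queries


def overlap_factor(per_query: float, baseline_per_query: float) -> float:
    """How many times more accesses are needed than in an overlap-free tree."""
    return per_query / baseline_per_query


def hot_fraction(factor: float) -> float:
    """Rough assumption from the message: k times more overlap means
    only about 1/k of the tree receives the actual inserts."""
    return 1.0 / factor


def fits_in_cache(index_bytes: int, factor: float, cache_bytes: int) -> bool:
    """True if the part of the tree that receives inserts fits in the cache."""
    return index_bytes / factor <= cache_bytes


if __name__ == "__main__":
    # 2241050 is the "accesses" value of the buffered run in the quoted table.
    # 10_000 queries is an ASSUMPTION, not a number given in the message.
    per_query = accesses_per_query(2241050, 10_000)
    # Baselines of 11 and 22 accesses are the values implied by
    # "10 - 20 times greater" for a figure of about 220.
    for baseline in (11, 22):
        factor = overlap_factor(per_query, baseline)
        print(f"baseline {baseline}: factor {factor:.1f}, "
              f"insert part of tree {hot_fraction(factor):.1%}")
    # Hypothetical sizes: a 10 GiB index and a 2 GiB cache.
    print(fits_in_cache(10 * 2**30, 10.0, 2 * 2**30))
```

With real numbers for index size and cache size, the same check indicates whether the unbuffered build is expected to stay cache-resident, which is when buffering has little to gain.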

------
With best regards,
Alexander Korotkov.
