Re: Yet another fast GiST build

From: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
Cc: Darafei Komяpa Praliaskouski <me(at)komzpa(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Yet another fast GiST build
Date: 2019-08-29 22:47:49
Message-ID: CAPpHfdsH04kPKRVTtb8aesfqDMMTS5cNHHk5fjPqQZc6p_HpjA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Aug 26, 2019 at 10:59 AM Andrey Borodin <x4mmm(at)yandex-team(dot)ru> wrote:
> In many cases GiST index can be build fast using z-order sorting.
>
> I've looked into proof of concept by Nikita Glukhov [0] and it looks very interesting.
> So, I've implemented yet another version of B-tree-like GiST build.
> It's main use case and benefits can be summarized with small example:
>
> postgres=# create table x as select point (random(),random()) from generate_series(1,3000000,1);
> SELECT 3000000
> Time: 5061,967 ms (00:05,062)
> postgres=# create index ON x using gist (point ) with (fast_build_sort_function=gist_point_sortsupport);
> CREATE INDEX
> Time: 6140,227 ms (00:06,140)
> postgres=# create index ON x using gist (point );
> CREATE INDEX
> Time: 32061,200 ms (00:32,061)
>
> As you can see, Z-order build is on order of magnitude faster. Select performance is roughly the same. Also, index is significantly smaller.

Cool! These experiments bring me to following thoughts. Can we not
only build, but also maintain GiST indexes in B-tree-like manner? If
we put Z-value together with MBR to the non-leaf keys, that should be
possible. Maintaining it in B-tree-like manner would have a lot of
advantages.
1) Binary search in non-leaf pages instead of probing each key is much faster.
2) Split algorithm is also simpler and faster.
3) We can refind existing index tuple in predictable time of O(log N).
And this doesn't depend on MBR overlapping degree. This is valuable
for future table AMs, which would have notion of deleting individual
index tuples (for instance, zheap promises to eventually support
delete-marking indexes).

Eventually, we may come to the idea of B-tree indexes with
user-defined additional keys in non-leaf tuples, which could be used
for scanning.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2019-08-29 23:33:48 Re: Yet another fast GiST build
Previous Message Peter Eisentraut 2019-08-29 18:52:21 Re: pg_upgrade: Error out on too many command-line arguments