Re: Yet another fast GiST build

From: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Darafei Komяpa Praliaskouski <me(at)komzpa(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Yet another fast GiST build
Date: 2019-08-30 03:21:55
Message-ID: CAPpHfdvYhZsKWaE3M2QTWcFPJsKxFKTWFkp4+5VE9z9ug+M3jw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Aug 30, 2019 at 2:34 AM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> On Thu, Aug 29, 2019 at 3:48 PM Alexander Korotkov
> <a(dot)korotkov(at)postgrespro(dot)ru> wrote:
> > > As you can see, Z-order build is on order of magnitude faster. Select performance is roughly the same. Also, index is significantly smaller.
> >
> > Cool! These experiments bring me to following thoughts. Can we not
> > only build, but also maintain GiST indexes in B-tree-like manner? If
> > we put Z-value together with MBR to the non-leaf keys, that should be
> > possible. Maintaining it in B-tree-like manner would have a lot of
> > advantages.
>
> I'm not an expert on GiST, but that seems like it would have a lot of
> advantages in the long term. It is certainly theoretically appealing.
>
> Could this make it easier to use merge join with containment
> operators? I'm thinking of things like geospatial joins, which can
> generally only be performed as nested loop joins at the moment. This
> is often wildly inefficient.

AFAICS, spatial joins aren't going to be as easy as just merge joins
on Z-value. When searching for close points in two datasets (closer
than given threshold) we can scan some ranges of Z-value in one
dataset while iterating on another. But dealing with prolonged
spatial objects in not that easy. In order to determine if there are
matching values in given Z-range you also need to be aware on size of
objects inside that Z-range. So, before merge-like join you need to
not only sort, but build some index-like structure.

Alternatively you can encode size in Z-value. But this increases
dimensionality of space and decreases efficiency of join. Also,
spatial join can be made using two indexes, even just current GiST
without Z-values. We've prototyped that, see [1].

Links
1. https://github.com/pgsphere/pgsphere/blob/crossmatch_cnode/crossmatch.c

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2019-08-30 03:28:02 Re: Yet another fast GiST build
Previous Message Richard Guo 2019-08-30 03:15:37 Re: A problem about partitionwise join