Re: GiST "choose subtree" support function to inline penalty

From: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
Cc: Darafei Praliaskouski <me(at)komzpa(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: GiST "choose subtree" support function to inline penalty
Date: 2019-06-26 01:20:06
Message-ID: CAPpHfdssv2i7CXTBfiyR6=_A3tp19+iLo-pkkfD6Guzs2-tvEw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jun 24, 2019 at 2:31 PM Andrey Borodin <x4mmm(at)yandex-team(dot)ru> wrote:

> > 24 июня 2019 г., в 15:08, Darafei Komяpa Praliaskouski <me(at)komzpa(dot)net>
> написал(а):
> >
> > I'm looking at PostGIS geometry GiST index build times and try to
> optimize withing the current GiST framework. The function that shows a lot
> on my flame graphs is penalty.
> >
> > I spent weekend rewriting PostGIS penalty to be as fast as possible.
> > (FYI https://github.com/postgis/postgis/pull/425/files)
> >
> > However I cannot get any meaningfully faster build time. Even when I
> strip it to "just return edge extension" index build time is the same.
> >
> > Is there a way to inline the penalty into above "choose subtree" loop
> somehow? That would also let us stop bit-fiddling floats to simulate a more
> complex choosing scheme.
>
> Maybe we could just add new opclass function for choosing subtree?
>

+1,
This sounds reasonable. Authors of existing GiST opclasses wouldn't have
trouble to keep compatible with new PostgreSQL versions.

I see one more use case for "choose subtree" instead "penalty". When
R*-tree chooses subtree, it considers to only extension of selected
bounding box, but also overlap increase of bounding boxes. This strategy
should have a positive effect of tree quality, besides I never seen it has
been measured separately. It probably kind of possible to implement using
"penalty" method assuming you have reference to the page in GISTENTRY. But
that doesn't seems a correct way to use the GiST interface. Additionally,
you don't know the attribute number to get the correct key in multicolumn
indexes. Having "choose subtree" method will make it possible to implement
this strategy in correct way. However, this use case is kind of opposite
to Darafei's one, because it should make choosing subtree slower (but
better).

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2019-06-26 02:22:36 Re: BUG #15858: could not stat file - over 4GB
Previous Message Tom Lane 2019-06-26 00:18:47 Re: UCT (Re: pgsql: Update time zone data files to tzdata release 2019a.)