Re: Questions about support function and abbreviate

From: Giuseppe Broccolo <g(dot)broccolo(dot)7(at)gmail(dot)com>
To: Han Wang <hanwgeek(at)gmail(dot)com>
Cc: Darafei Komяpa Praliaskouski <me(at)komzpa(dot)net>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Questions about support function and abbreviate
Date: 2021-06-12 23:45:45
Message-ID: CAFtuf8ATNLPSnF-A9g3sw+B_2MH=4GeCL-Lzw+YNTv=EnGy+Bw@mail.gmail.com
Lists: pgsql-hackers

Hi Han,

Darafei already gave a good answer to your question; I will just add a few
things in the hope of making things clearer for your use case.

The SortSupport implementation in PostgreSQL allows comparisons to be made
at the binary level in a dedicated region of memory, where data can be
accessed quickly through "sort tuples", which are references to the actual
data in the heap. Each sort tuple has room for a value the length of a
native pointer, which is 8 bytes on 64-bit systems. That is enough space
for standard data types like integers or floats, but not for longer data
types or varlena data like geometries.

In this last case, we need to pass to the sort tuples an abbreviated
version of the key that contains its most representative part. This is the
purpose of the abbreviation fields, which must be set in order to create
the abbreviated keys.
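
As a rough illustration of the idea (a sketch in Python rather than
PostgreSQL's C; `abbreviate` and `POINTER_SIZE` are hypothetical names, not
PostgreSQL functions), a long key can be packed into a pointer-sized
integer by keeping only its first 8 bytes. Keys that differ within those
bytes are ordered correctly by integer comparison alone, while keys that
share the prefix come out equal and still need the full value:

```python
POINTER_SIZE = 8  # bytes in a native pointer on a 64-bit system


def abbreviate(key: bytes) -> int:
    """Pack the first POINTER_SIZE bytes of key into one unsigned integer.

    Hypothetical helper for illustration only. Big-endian packing keeps
    integer order consistent with the bytewise order of the prefixes.
    """
    prefix = key[:POINTER_SIZE].ljust(POINTER_SIZE, b"\x00")
    return int.from_bytes(prefix, "big")


# Keys that differ within the first 8 bytes are ordered by the integers alone.
print(abbreviate(b"apple") < abbreviate(b"banana"))
# Keys sharing an 8-byte prefix collapse to the same abbreviated key.
print(abbreviate(b"postgresql-a") == abbreviate(b"postgresql-b"))
```

The second case is why an abbreviated key can only ever be a first
approximation: equality of abbreviated keys says nothing about the full
values.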

To answer your question more specifically, the four abbreviation-related
fields are:

* comparator --> the function used to compare abbreviated keys
* abbrev_converter --> the function that creates the abbreviated keys
(NOTE: in src/backend/access/gist/gistproc.c it keeps only the leading bits
of the z-order hash of a geometry, as many as fit in a Datum)
* abbrev_abort --> the function that checks whether abbreviation should be
abandoned, even when the key is longer than a native pointer (NOTE: in
src/backend/access/gist/gistproc.c it never aborts, which means that
abbreviation is always considered worthwhile)
* abbrev_full_comparator --> the function used for comparisons when
falling back to the full, non-abbreviated keys (NOTE: when the abbreviate
flag is false, comparator is set to this same function)
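
The interplay of the four fields can be sketched with a toy model. This is
Python, not the PostgreSQL C API: `SortSupportModel`,
`sort_with_abbreviation`, `three_way` and the two-byte converter are all
assumptions made for illustration, and the model's `abbrev_abort` receives
the list of abbreviated keys, whereas the real callback is handed the
tuple count and the SortSupport state.

```python
from dataclasses import dataclass
from functools import cmp_to_key
from typing import Callable, List, Optional


def three_way(a, b) -> int:
    """Return -1, 0 or 1, like a C comparator."""
    return (a > b) - (a < b)


@dataclass
class SortSupportModel:
    """Toy stand-in for the four fields; not the real C struct."""
    comparator: Callable[[int, int], int]                  # abbreviated keys
    abbrev_converter: Callable[[bytes], int]               # full -> abbreviated
    abbrev_full_comparator: Callable[[bytes, bytes], int]  # authoritative
    abbrev_abort: Optional[Callable[[List[int]], bool]] = None


def sort_with_abbreviation(keys: List[bytes],
                           ssup: SortSupportModel) -> List[bytes]:
    abbreviated = [ssup.abbrev_converter(k) for k in keys]

    # If abbreviation is judged not worthwhile, use the full comparator only.
    if ssup.abbrev_abort is not None and ssup.abbrev_abort(abbreviated):
        return sorted(keys, key=cmp_to_key(ssup.abbrev_full_comparator))

    def compare(t1, t2) -> int:
        # Cheap comparison on the abbreviated keys first ...
        result = ssup.comparator(t1[0], t2[0])
        if result == 0:
            # ... and the full comparator only to resolve a tie.
            result = ssup.abbrev_full_comparator(t1[1], t2[1])
        return result

    # Each "sort tuple" pairs an abbreviated key with its full value.
    sort_tuples = list(zip(abbreviated, keys))
    return [full for _, full in sorted(sort_tuples, key=cmp_to_key(compare))]


ssup = SortSupportModel(
    comparator=three_way,
    # Keep only the first 2 bytes so that ties are easy to provoke.
    abbrev_converter=lambda key: int.from_bytes(key[:2].ljust(2, b"\x00"), "big"),
    abbrev_full_comparator=three_way,
)
print(sort_with_abbreviation([b"abz", b"ba", b"aba", b"aa"], ssup))
```

Here `b"abz"` and `b"aba"` share the abbreviated key for `ab`, so only that
pair reaches the full comparator; every other pair is decided by the
abbreviated keys alone.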

Hope it helps,
Giuseppe.

On Sat, Jun 12, 2021 at 08:43 Han Wang <hanwgeek(at)gmail(dot)com> wrote:

> Hi Darafei,
>
> Thanks for your reply.
>
> However, I still don't get the full picture of this. Let me make my
> question more clear.
>
> First of all, in `gistproc.c`
> <https://github.com/postgres/postgres/blob/master/src/backend/access/gist/gistproc.c#L1761>
> of Postgres, it shows that the `abbreviate` attribute should be set before
> the `abbrev_converter` is defined. So I would like to know where to define a
> `SortSupport` structure with `abbreviate` set to `true`.
>
> Secondly, in the support functions of the internal data type `Point`, the
> `abbrev_full_comparator` just z-order hashes the point first, as the
> `abbrev_converter` does, and then compares the hash values. So I don't see
> the difference between `full_comparator` and `comparator` after
> `abbrev_converter`.
>
> Best regards,
> Han
>
> On Sat, Jun 12, 2021 at 2:55 PM Darafei "Komяpa" Praliaskouski <
> me(at)komzpa(dot)net> wrote:
>
>> Hello,
>>
>> the abbrev_converter is applied whenever it is defined. The values are
>> first sorted with the abbreviated comparator on the shortened versions,
>> and if there is a tie the system asks the real full comparator to resolve
>> it.
>>
>> This article seems to be rather comprehensive:
>> https://brandur.org/sortsupport
>>
>> On Sat, Jun 12, 2021 at 9:51 AM Han Wang <hanwgeek(at)gmail(dot)com> wrote:
>>
>>> Hi all,
>>>
>>> I am trying to implement a sort support function for geometry data types
>>> in PostGIS with the new feature `SortSupport`. However, I have a question
>>> about this.
>>>
>>> I think it is hard to apply a sort support function to a complex data
>>> type without the `abbrev_converter` to simplify the data structure into a
>>> single `Datum`. However, I do not know how the system determines when to
>>> apply the converter.
>>>
>>> I appreciate any answers or suggestions. I am looking forward to hearing
>>> from you.
>>>
>>> Best regards,
>>> Han
>>>
>>
>>
>> --
>> Darafei "Komяpa" Praliaskouski
>> OSM BY Team - http://openstreetmap.by/
>>
>
