Re: PATCH: Using BRIN indexes for sorted output

From: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Greg Stark <stark(at)mit(dot)edu>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: PATCH: Using BRIN indexes for sorted output
Date: 2023-07-11 11:20:11
Message-ID: CAEze2Wg00Pry5KsNEqJps-uX+T6f=YTm-jyLZPt=yDNO5+3v0Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, 10 Jul 2023 at 22:04, Tomas Vondra
<tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
> On 7/10/23 18:18, Matthias van de Meent wrote:
>> On Mon, 10 Jul 2023 at 17:09, Tomas Vondra
>> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>>> On 7/10/23 14:38, Matthias van de Meent wrote:
>>>>> I haven't really thought about geometric types, just about minmax and
>>>>> minmax-multi. It's not clear to me what the benefit for these types be.
>>>>> I mean, we can probably sort points lexicographically, but is anyone
>>>>> doing that in queries? It seems useless for order by distance.
>>>>
>>>> Yes, that's why you would sort them by distance, where the distance is
>>>> generated by the opclass as min/max distance between the summary and
>>>> the distance's origin, and then inserted into the tuplesort.
>>>>
>>>
>>> OK, so the query says "order by distance from point X" and we calculate
>>> the min/max distance of values in a given page range.
>>
>> Yes, and because it's BRIN that's an approximation, which should
>> generally be fine.
>>
>
> Approximation in what sense? My understanding was we'd get a range of
> distances that we know covers all rows in that range. So the results
> should be accurate, no?

The distance is going to be accurate only to the degree that the
summary can produce accurate distances for the datapoints it
represents. That can be quite imprecise due to the nature of the
contained datapoints: a summary of the points (-1, -1) and (1, 1) will
have a minimum distance of 0 to the origin, where the summary (-1, 0)
and (-1, 0.5) would have a much more accurate distance of 1. The point
I was making is that the summary can only approximate the distance,
and that approximation is fine w.r.t. the BRIN sort algoritm.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech/)

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Aleksander Alekseev 2023-07-11 11:52:28 Re: SLRUs in the main buffer pool, redux
Previous Message Zhijie Hou (Fujitsu) 2023-07-11 11:01:20 RE: Support logical replication of DDLs