Re: BRIN minmax multi - incorrect distance for infinite timestamp/date

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: BRIN minmax multi - incorrect distance for infinite timestamp/date
Date: 2023-10-16 14:03:12
Message-ID: dd7fcf77-414b-b4b2-358a-0fc505255fba@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 10/16/23 11:25, Ashutosh Bapat wrote:
> Thanks Tomas for bringing this discussion to hackers.
>
>
> On Fri, Oct 13, 2023 at 8:58 PM Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
>>
>> On Fri, 13 Oct 2023 at 13:17, Tomas Vondra
>> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>>>
>>> I do plan to backpatch this, yes. I don't think there are many people
>>> affected by this (few people are using infinite dates/timestamps, but
>>> maybe the overflow could be more common).
>>>
>
> The example you gave is missing CREATE INDEX command. Is it "create
> index test_idx_a on test using brin(a);"

Ah, you're right - apologies. FWIW when testing I usually use 1-page
ranges, to possibly hit more combinations / problems. More importantly,
it needs to specify the opclass, otherwise it'll use the default minmax
opclass which does not use distance at all:

create index test_idx_a on test
using brin (a timestamptz_minmax_multi_ops) with (pages_per_range=1);

>
> Do already create indexes have this issue? Do they need to rebuilt
> after upgrading?
>

Yes, existing indexes will have inefficient ranges. I'm not sure we want
to push people to reindex everything, the issue seem somewhat unlikely
in practice.

>>
>> OK, though I doubt that such values are common in practice.
>>
>> There's also an overflow problem in
>> brin_minmax_multi_distance_interval() though, and I think that's worse
>> because overflows there throw "interval out of range" errors, which
>> can prevent index creation or inserts.
>>
>> There's a patch (0009 in [1]) as part of the infinite interval work,
>> but that just uses interval_mi(), and so doesn't fix the
>> interval-out-of-range errors, except for infinite intervals, which are
>> treated as special cases, which I don't think is really necessary.
>>
>
> Right. I used interval_mi() to preserve the finite value behaviour as
> is. But ...
>
>> I think this should be rewritten to compute delta from ia and ib
>> without going via an intermediate Interval value. I.e., instead of
>> computing "result", just do something like
>>
>> dayfraction = (ib->time % USECS_PER_DAY) - (ia->time % USECS_PER_DAY);
>> days = (ib->time / USECS_PER_DAY) - (ia->time / USECS_PER_DAY);
>> days += (int64) ib->day - (int64) ia->day;
>> days += ((int64) ib->month - (int64) ia->month) * INT64CONST(30);
>>
>> then convert to double precision as it does now:
>>
>> delta = (double) days + dayfraction / (double) USECS_PER_DAY;
>>
>
> Given Tomas's explanation of how these functions are supposed to work,
> I think your suggestions is better.
>
> I was worried that above calculations may not produce the same result
> as the current code when there is no error because modulo and integer
> division are not distributive over subtraction. But it looks like
> together they behave as normal division which is distributive over
> subtraction. I couldn't find an example where that is not true.
>
> Tomas, you may want to incorporate this in your patch. If not, I will
> incorporate it in my infinite interval patchset in [1].
>

I'd rather keep it as separate patch, although maybe let's deal with it
separately from the larger patches. It's a bug, and having it in a patch
set that adds a feature does not seem like a good idea (or maybe I don't
understand what the other thread does, I haven't looked very closely).

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2023-10-16 14:19:57 Re: Rename backup_label to recovery_control
Previous Message Tom Lane 2023-10-16 13:54:41 Re: Add support for AT LOCAL