Re: BRIN minmax multi - incorrect distance for infinite timestamp/date

From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: BRIN minmax multi - incorrect distance for infinite timestamp/date
Date: 2023-10-16 09:25:24
Message-ID: CAExHW5tsrfu3k5We1tr_ANaQ_0OJc6Zf2YMB+TkmB2PQwirAPQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks Tomas for bringing this discussion to hackers.

On Fri, Oct 13, 2023 at 8:58 PM Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
>
> On Fri, 13 Oct 2023 at 13:17, Tomas Vondra
> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
> >
> > I do plan to backpatch this, yes. I don't think there are many people
> > affected by this (few people are using infinite dates/timestamps, but
> > maybe the overflow could be more common).
> >

The example you gave is missing CREATE INDEX command. Is it "create
index test_idx_a on test using brin(a);"

Do already create indexes have this issue? Do they need to rebuilt
after upgrading?

>
> OK, though I doubt that such values are common in practice.
>
> There's also an overflow problem in
> brin_minmax_multi_distance_interval() though, and I think that's worse
> because overflows there throw "interval out of range" errors, which
> can prevent index creation or inserts.
>
> There's a patch (0009 in [1]) as part of the infinite interval work,
> but that just uses interval_mi(), and so doesn't fix the
> interval-out-of-range errors, except for infinite intervals, which are
> treated as special cases, which I don't think is really necessary.
>

Right. I used interval_mi() to preserve the finite value behaviour as
is. But ...

> I think this should be rewritten to compute delta from ia and ib
> without going via an intermediate Interval value. I.e., instead of
> computing "result", just do something like
>
> dayfraction = (ib->time % USECS_PER_DAY) - (ia->time % USECS_PER_DAY);
> days = (ib->time / USECS_PER_DAY) - (ia->time / USECS_PER_DAY);
> days += (int64) ib->day - (int64) ia->day;
> days += ((int64) ib->month - (int64) ia->month) * INT64CONST(30);
>
> then convert to double precision as it does now:
>
> delta = (double) days + dayfraction / (double) USECS_PER_DAY;
>

Given Tomas's explanation of how these functions are supposed to work,
I think your suggestions is better.

I was worried that above calculations may not produce the same result
as the current code when there is no error because modulo and integer
division are not distributive over subtraction. But it looks like
together they behave as normal division which is distributive over
subtraction. I couldn't find an example where that is not true.

Tomas, you may want to incorporate this in your patch. If not, I will
incorporate it in my infinite interval patchset in [1].

[1] https://www.postgresql.org/message-id/CAExHW5u1JE7dxK=WLzqhCszNToxQzJdieRmhREpW6r8w6kcRGQ@mail.gmail.com

--
Best Wishes,
Ashutosh Bapat

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2023-10-16 09:46:28 Re: RFC: Logging plan of the running query
Previous Message Amit Kapila 2023-10-16 09:13:57 Re: [PoC] pg_upgrade: allow to upgrade publisher node