Re: Infinite Interval

From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: jian he <jian(dot)universality(at)gmail(dot)com>, Joseph Koshakow <koshy44(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Gregory Stark (as CFM)" <stark(dot)cfm(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Infinite Interval
Date: 2023-09-29 07:12:56
Message-ID: CAEZATCVRTR3ip-2LQiNgPP3v+PKC_kFFRNcxowPbs9uCrpzBzg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, 22 Sept 2023 at 09:09, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
>
> On Fri, 22 Sept 2023 at 08:49, Ashutosh Bapat
> <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
> >
> > Rest of the patches are
> > same as previous set.
>
> OK, I'll take a look.
>

I've been going over this in more detail, and I'm attaching my review
comments as an incremental patch to make it easier to see the changes.

Aside from some cosmetic stuff, I've made the following more
substantive changes:

1. I don't think it's really worth adding the
pg_mul_add_sNN_overflow() functions to int.h.

I thought about this for a while, and it looks to me as though they're
really only of very limited general use, and even this patch only used
them in a couple of places.

Looking around more widely, the majority of places that multiply then
add actually require a 4-argument function that computes result = a *
b + c. But even then, such functions would offer no performance
benefit, and wouldn't really do much to improve code readability at
call sites.

And there's another issue: someone using these functions might
reasonably expect them to return true if the result overflows, but
actually, as written, they also return true if the intermediate
product overflows, which isn't necessarily what's expected / wanted.

So I think it's best to just drop these functions, getting rid of
0001, and rewriting 0002 to just use the existing int.h functions.
After a little more copy-editing, I think that actually makes the code
in make_interval() more readable.

I think that part is now ready to commit, and I plan to push this fix
to make_interval() separately, since it's really a bug-fix, not
related to support for infinite intervals. In line with recent
precedent, I don't think it's worth back-patching though, since such
inputs are pretty unlikely in production.

2. The various in_range() functions needed adjusting to handle
infinite interval offsets.

For timestamp values, I followed the precedent set by the equivalent
float/numeric code. I.e., all (finite and non-finite) timestamps are
regarded as infinitely following -infinity and infinitely preceding
+infinity.

For time values, it's a bit different because no time values precede
or follow any other by more than 24 hours, so a window frame between
+inf following and +inf following is empty (whereas in the timestamp
case it contains +inf). Put another way, such a window frame is empty
because a time value can't be infinity.

3. I got rid of interval2timestamp_no_overflow() because I don't think
it really makes much sense to convert an interval to a timestamp, and
it's a bit of a hack anyway (as selfuncs.c itself admits). Actually, I
think it's OK to just leave selfuncs.c as it is. The existing code
will cope just fine with infinite intervals, since they aren't really
infinite, just larger than any others.

4. I tested pg_upgrade on a table with an interval with INT_MAX
months, and it was silently converted to infinity. I think that's
probably the best outcome (better than failing). However, this means
that we really should require all 3 fields of an interval to be
INT_MIN/MAX for it to be considered infinite, otherwise it would be
possible to have multiple internal representations of infinity that do
not compare as equal.

Similarly, interval_in() needs to accept such inputs, otherwise things
like pg_dump/restore from pre-17 databases could fail. But since it
now requires all 3 fields of the interval to be INT_MIN/MAX for it to
be infinite, the odds of that happening by accident are vanishingly
small in practice.

This approach also means that the range of allowed finite intervals is
only reduced by 1 microsecond at each end of the range, rather than a
whole month.

Also, it means that it is no longer necessary to change a number of
the regression tests (such as the justify_interval() tests) for values
near INT_MIN/MAX.

Overall, I think this is now pretty close to being ready for commit.

Regards,
Dean

Attachment Content-Type Size
v25-0001-Check-for-overflow-in-make_interval.patch text/x-patch 5.8 KB
v25-0005-Use-integer-overflow-checking-routines-to-add-an.patch text/x-patch 15.7 KB
v25-0003-Introduce-infinity-interval-specification-in-Dec.patch text/x-patch 5.1 KB
v25-0004-Support-infinite-interval-values-in-sum-and-avg-.patch text/x-patch 25.7 KB
v25-0002-Add-infinite-interval-values.patch text/x-patch 96.9 KB
v25-0006-Implement-serialization-functions-for-interval-a.patch text/x-patch 5.7 KB
v25-0007-Code-review-of-infinite-interval-support.patch text/x-patch 102.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2023-09-29 07:27:52 Re: Add support for AT LOCAL
Previous Message Alexander Korotkov 2023-09-29 06:35:24 Re: Index range search optimization