Re: Fix overflow in DecodeInterval

From: Joseph Koshakow <koshy44(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Fix overflow in DecodeInterval
Date: 2022-02-19 19:26:17
Message-ID: CAAvxfHduBhZdUXLFwLOSGW=0c17frLojcbCqNsZrw5n73pekZg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Copying over a related thread here to have info all in one place.
It's a little fragmented so sorry about that.

On Fri, Feb 18, 2022 at 11:44 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Joseph Koshakow <koshy44(at)gmail(dot)com> writes:
> > When formatting the output of an Interval, we call abs() on the hours
> > field of the Interval. Calling abs(INT_MIN) returns back INT_MIN
> > causing the output to contain two '-' characters. The attached patch
> > fixes that issue by special casing INT_MIN hours.
>
> Good catch, but it looks to me like three out of the four formats in
> EncodeInterval have variants of this problem --- there are assumptions
> throughout that code that we can compute "-x" or "abs(x)" without
> fear. Not much point in fixing only one symptom.
>
> Also, I notice that there's an overflow hazard upstream of here,
> in interval2tm:
>
> regression=# select interval '214748364 hours' * 11;
> ERROR: interval out of range
> regression=# \errverbose
> ERROR: 22008: interval out of range
> LOCATION: interval2tm, timestamp.c:1982
>
> There's no good excuse for not being able to print a value that
> we computed successfully.
>
> I wonder if the most reasonable fix would be to start using int64
> instead of int arithmetic for the values that are potentially large.
> I doubt that we'd be taking much of a performance hit on modern
> hardware.
>
> regards, tom lane

Joseph Koshakow <koshy44(at)gmail(dot)com> writes:
> On Fri, Feb 18, 2022 at 11:44 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I wonder if the most reasonable fix would be to start using int64
>> instead of int arithmetic for the values that are potentially large.
>> I doubt that we'd be taking much of a performance hit on modern
>> hardware.

> That's an interesting idea. I've always assumed that the range of the
> time fields of Intervals was 2147483647 hours 59 minutes
> 59.999999 seconds to -2147483648 hours -59 minutes
> -59.999999 seconds. However the only reason that we can't support
> the full range of int64 microseconds is because the struct pg_tm fields
> are only ints. If we increase those fields to int64 then we'd be able to
> support the full int64 range for microseconds as well as implicitly fix
> some of the overflow issues in DecodeInterval and EncodeInterval.

On Sat, Feb 19, 2022 at 1:52 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Joseph Koshakow <koshy44(at)gmail(dot)com> writes:
> > On Sat, Feb 19, 2022 at 12:00 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> I think that messing with struct pg_tm might have too many side-effects.
> >> However, pg_tm isn't all that well adapted to intervals in the first
> >> place, so it'd make sense to make a new struct specifically for interval
> >> decoding.
>
> > Yeah that's a good idea, pg_tm is used all over the place. Is there a
> > reason we need an intermediate structure to convert to and from?
> > We could just convert directly to an Interval in DecodeInterval and
> > print directly from an Interval in EncodeInterval.
>
> Yeah, I thought about that too, but if you look at the other callers of
> interval2tm, you'll see this same set of issues. I'm inclined to keep
> the current code structure and just fix the data structure. Although
> interval2tm isn't *that* big, I don't think four copies of its logic
> would be an improvement.
>
> > Also I originally created a separate thread and patch because I
> > thought this wouldn't be directly related to my other patch,
> > https://commitfest.postgresql.org/37/3541/. However I think with these
> > discussed changes it's directly related. So I think it's best to close
> > this thread and patch and copy this conversion to the other thread.
>
> Agreed.
>
> regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2022-02-19 19:39:04 Re: Reducing power consumption on idle servers
Previous Message Daniel Gustafsson 2022-02-19 19:24:09 Re: doc: join state for merge join