INTERVAL overflow detection is terribly broken

From: Rok Kralj <rok(dot)kralj(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: INTERVAL overflow detection is terribly broken
Date: 2013-06-23 13:00:59
Message-ID: CAMWF=HS++N9-NKsh-o5QSymvp0Np-VB0GdWwAt4uie1h8ZdNTQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi, after studying ITERVAL and having a long chat with RhoidumToad and
StuckMojo on #postgresql, I am presenting you 3 bugs regarding INTERVAL.

As far as I understand, the Interval struct (binary internal
representation) consists of:

int32 months
int32 days
int64 microseconds

1. OUTPUT ERRORS: Since 2^63 microseconds equals 2,562,047,788 > 2^31
hours, the overflow in pg_tm when displaying the value causes overflow. The
value of Interval struct is actually correct, error happens only on
displaying it.

SELECT INTERVAL '2147483647 hours' + INTERVAL '5 hours'
"-2147483644:00:00"

Even wireder:

SELECT INTERVAL '2147483647 hours' + '1 hour'
"--2147483648:00:00"

notice the double minus? Don't ask how I came across this two bugs.

2. OPERATION ERRORS: When summing two intervals, the user is not notified
when overflow occurs:

SELECT INT '2147483647' + INT '1'
ERROR: integer out of range

SELECT INTERVAL '2147483647 days' + INTERVAL '1 day'
"-2147483648 days"

This should be analogous.

3. PARSER / INPUT ERRORS:

This is perhaps the hardest one to explain, since this is an architectural
flaw. You are checking the overflows when parsing string -> pg_tm struct.
However, at this point, the parser doesn't know, that weeks and days are
going to get combined, or years are going to get converted to months, for
example.

Unawarness of underlying Interval struct causes two types of suberrors:

a) False positive

SELECT INTERVAL '2147483648 microseconds'
ERROR: interval field value out of range: "2147483648 microseconds"

This is not right. Microseconds are internally stored as 64 bit signed
integer. The catch is: this amount of microseconds is representable in
Interval data structure, this shouldn't be an error.

b) False negative

SELECT INTERVAL '1000000000 years'
"-73741824 years"

We don't catch errors like this, because parser only checks for overflow in
pg_tm. If overflow laters happens in Interval, we don't seem to care.

4. POSSIBLE SOLUTIONS:

a) Move the overflow checking just after the conversion of pg_tm ->
Interval is made. This way, you can accurately predict if the result is
really not store-able.

b) Because of 1), you have to declare tm_hour as int64, if you want to use
that for the output. But, why not use Interval struct for printing
directly, without intermediate pg_tm?

5. Offtopic: Correct the documentation, INTERVAL data size is 16 bytes, not
12.

Rok Kralj

--
eMail: rok(dot)kralj(at)gmail(dot)com

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Stephen Frost 2013-06-23 13:40:55 Re: A better way than tweaking NTUP_PER_BUCKET
Previous Message Simon Riggs 2013-06-23 09:45:09 Re: A better way than tweaking NTUP_PER_BUCKET