Re: [PATCH] Integer overflow in timestamp[tz]_part() and date/time boundaries check

From: Vitaly Burovoy <vitaly(dot)burovoy(at)gmail(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Integer overflow in timestamp[tz]_part() and date/time boundaries check
Date: 2016-02-25 01:21:19
Message-ID: CAKOSWNnaRepjUi0LPbXWbV4rYq4OyHXi-w6Vb7S8qAPmTUrX6Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2/2/16, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com> wrote:
> On 2/2/16 6:39 PM, Tom Lane wrote:
>> I'm inclined to think that a good solution would be to create an
>> artificial restriction to not accept years beyond, say, 100000 AD.
>> That would leave us with a lot of daylight to not have to worry
>> about corner-case overflows in timestamp arithmetic. I'm not sure
>> though where we'd need to enforce such a restriction; certainly in
>> timestamp[tz]_in, but where else?
>
> Probably some of the casts (I'd think at least timestamp->timestamptz).
> Maybe timestamp[tz]_recv. Most of the time*pl* functions. :/

Please find attached a patch checks boundaries of date/timestamp[tz].
There are more functions: converting to/from timestamptz, truncating,
constructing from date and time etc.

I left the upper boundary as described[1] in the documentation
(294276-12-31 AD), lower - "as is" (4714-11-24 BC).
It is easy to change the lower boundary to 4713-01-01BC (as described
in the documentation) and it seems necessary because it allows to
simplify IS_VALID_JULIAN and IS_VALID_JULIAN4STAMPS and avoid the next
behavior:

postgres=# select
postgres-# to_char(date_trunc('week', '4713-01-01 BC'::date),'day')
postgres-# ,to_char(date_trunc('week', '4714-12-29 BC'::date),'day')
postgres-# ,to_char(date_trunc('week', '4714-12-28 BC'::date),'day');
to_char | to_char | to_char
-----------+-----------+-----------
monday | monday | thursday
(1 row)

since 4714-12-28 BC and to the past detection when a week is starting
is broken (because it is boundary of isoyears -4713 and -4712).
Is it worth to break undocumented range or leave it as is?

There is one more flaw: checking for a correctness begins from date
and if default TZ is not UTC, dump/restore of values of type
timestamptz which are close to allowed boundaries can be broken (and
such result can't be restored because date is not in allowed range):
postgres=# SET TIME ZONE 'GMT+1';
SET
postgres=# COPY (SELECT '4714-11-24 00:00:00.000000+00
BC'::timestamptz) TO STDOUT;
4714-11-23 23:00:00-01 BC

Also I'm asking for a help because the query (in default TZ='GMT+1'):
postgres=# SELECT '4714-11-24 00:00:00.000000+00 BC'::timestamptz;

in psql gives a result "4714-11-23 23:00:00-01 BC",
but in a testing system gives "Sun Nov 23 23:00:00 4714 GMT BC"
without TZ offset.

I don't see what can be added to the documentation with the applied patch.

More testings, finding bugs, uncovered functions, advice, comment
improvements are very appreciated.

[1]http://www.postgresql.org/docs/devel/static/datatype-datetime.html

--
Best regards,
Vitaly Burovoy

Attachment Content-Type Size
fix_timestamps_overflow.v001.patch application/octet-stream 112.4 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Vitaly Burovoy 2016-02-25 01:33:54 Re: [PATCH] Integer overflow in timestamp[tz]_part() and date/time boundaries check
Previous Message Thomas Munro 2016-02-25 00:57:35 Re: Random note of encouragement