Re: [PATCH] Integer overflow in timestamp[tz]_part() and date/time boundaries check

From: Vitaly Burovoy <vitaly(dot)burovoy(at)gmail(dot)com>
To: Mark Dilger <hornschnorter(at)gmail(dot)com>
Cc: David Steele <david(at)pgmasters(dot)net>, Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Integer overflow in timestamp[tz]_part() and date/time boundaries check
Date: 2016-03-16 21:47:52
Message-ID: CAKOSWNmraFH4jjFKd5TKBdUoBBo_G7pT9cE9Y-n3XCK=8e1M=Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2016-03-15, Mark Dilger <hornschnorter(at)gmail(dot)com> wrote:
>
>> On Mar 14, 2016, at 5:12 PM, Vitaly Burovoy <vitaly(dot)burovoy(at)gmail(dot)com>
>> wrote:
>>
>> On 3/14/16, Mark Dilger <hornschnorter(at)gmail(dot)com> wrote:
>>> The first thing I notice about this patch is that
>>> src/include/datatype/timestamp.h
>>> has some #defines that are brittle. The #defines have comments
>>> explaining
>>> their logic, but I'd rather embed that in the #define directly:
>>>
>>> This:
>>>
>>> +#ifdef HAVE_INT64_TIMESTAMP
>>> +#define MIN_TIMESTAMP INT64CONST(-211813488000000000)
>>> +/* == (0 - POSTGRES_EPOCH_JDATE) * 86400 * USECS_PER_SEC */
>>> +#define MAX_TIMESTAMP INT64CONST(9223371331200000000)
>>> +/* == (JULIAN_MAX4STAMPS - POSTGRES_EPOCH_JDATE) * 86400 *
>>> USECS_PER_SEC
>>> */
>>> +#else
>>> +#define MIN_TIMESTAMP -211813488000.0
>>> +/* == (0 - POSTGRES_EPOCH_JDATE) * 86400 */
>>> +#define MAX_TIMESTAMP 9223371331200.0
>>> +/* == (JULIAN_MAX4STAMPS - POSTGRES_EPOCH_JDATE) * 86400 */
>>> +#endif
>>>
>>> Could be written as:
>>>
>>> #ifdef HAVE_INT64_TIMESTAMP
>>> #define MIN_TIMESTAMP
>>> ((INT64CONST(0) - POSTGRES_EPOCH_JDATE) * USECS_PER_DAY)
>>> #define MAX_TIMESTAMP
>>> ((JULIAN_MAX4STAMPS - POSTGRES_EPOCH_JDATE) * USECS_PER_DAY)
>>> #else
>>> #define MIN_TIMESTAMP
>>> ((INT64CONST(0) - POSTGRES_EPOCH_JDATE) * SECS_PER_DAY)
>>> #define MAX_TIMESTAMP
>>> ((JULIAN_MAX4STAMPS - POSTGRES_EPOCH_JDATE) * SECS_PER_DAY)
>>> #endif
>>>
>>> I assume modern compilers would convert these to the same constants at
>>> compile-time,
>>
>> Firstly, Postgres is compiling not only by modern compilers.
>
> Do you have an example of a compiler that will not do this constant folding
> at compile time?

No, I'm not good at knowing features of all versions and all kings of
compilers, but I'm sure constants are better than expressions for big
values. =)

>>> rather than impose a runtime penalty.
>>
>> Secondary, It is hard to write it correctly obeying Postgres' coding
>> standard (e.g. 80-columns border) and making it clear: it is not so
>> visual difference between USECS_PER_DAY and SECS_PER_DAY in the
>> expressions above (but it is vivid in comments in the file).
>
> Hmm. I think using USECS_PER_DAY is perfectly clear, but that is a
> personal
> opinion. I don't see any way to argue if you don't see it that way.

I'm talking about perception of the constants when they a very similar
but they are not justified by a single column (where difference
_in_expressions_ are clear). There was a difference by a single char
("U") only which is not so obvious without deep looking on it (be
honest I'd missed it until started to write an answer).

>> or "Why is INT64CONST set for
>> MIN_TIMESTAMP, but not for MAX_TIMESTAMP?" (JULIAN_MAX4STAMPS is _not_
>> int64).
>
> I was only casting the zero to int64. That doesn't seem necessary, so it
> can
> be removed. Both MIN_TIMESTAMP and MAX_TIMESTAMP were defined
> in terms of USECS_PER_DAY, which itself is defined in terms of INT64CONST,
> so I believe they both work out to be an int64 constant.

I hope so. But in such cases I'd prefer to _begin_ calculations from
int64, not to _finish_ by it.
It is impossible to pass constants (like JULIAN_MAX4STAMPS) to
INT64CONST macros. Inserting "(int64)" makes rows larger by 7 chars...
;-)

>>> The #defines would be less brittle in
>>> the event, for example, that the postgres epoch were ever changed.
>>
>> I don't think it is real, and even in such case all constants are
>> collected together in the file and will be found and changed at once.
>
> I agree that they would be found at once. I disagree that the example
> is not real, as I have changed the postgres epoch myself in some builds,
> to be able to use int32 timestamps on small devices.

Wow! I'm sorry, I didn't know about it.
But in such case (tighten to int32) there are more than two places
which should be changed. Two more (four with disabled
HAVE_INT64_TIMESTAMP) constants are not big deal with it.

--
Best regards,
Vitaly Burovoy

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Mark Dilger 2016-03-16 22:01:51 Re: [PATCH] Integer overflow in timestamp[tz]_part() and date/time boundaries check
Previous Message Jim Nasby 2016-03-16 21:41:13 Minor typos in optimizer/README