Re: postgresql 7.3.2 bug on date '1901-12-13' and '1901-12

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Stephan Szabo <sszabo(at)megazone(dot)bigpanda(dot)com>
Cc: Arnold Mavromatis <A(dot)Mavromatis(at)bom(dot)gov(dot)au>, pgsql-bugs(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: postgresql 7.3.2 bug on date '1901-12-13' and '1901-12
Date: 2003-08-25 20:48:33
Message-ID: 10355.1061844513@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

Stephan Szabo <sszabo(at)megazone(dot)bigpanda(dot)com> writes:
> On Thu, 21 Aug 2003, Tom Lane wrote:
>> Stephan Szabo <sszabo(at)megazone(dot)bigpanda(dot)com> writes:
>>> Wait, he's in australia, what if he's getting the edge case the other way.
>>
>> I'm inclined to fix to_date by decomposing the code differently ---
>> it should avoid the coercion to timestamp, which is a waste of cycles
>> anyway. But is to_timestamp (and more generally timestamp's input
>> converter) broken? If so, how can we do better? I don't think we can
>> entirely avoid the problem of a transition between local and GMT time.

> Yes. Timestamp with timezone is broken on the same boundaries in general.
> I'm not really sure how to do better without some work, it seems we end up
> with multiple different input values getting the same internal
> representation so we can differentiate which version of the input was used
> to get there (whether the user said 1901-12-13 23:00 or 1901-12-14).

I've fixed to_date() along the above lines, but the general problem of
how timestamp I/O should behave remains.

I've come to the conclusion that there isn't any really consistent
behavior if we want to stick with the current definition that
"timestamps outside the Unix date range are always UTC". If we do that,
then there is a set of timestamps at one end of the date range that are
ambiguous (they could be taken as either UTC or local), while at the
other end of the range there is a set of timestamps that can't be
validly converted as either one. This is essentially the same problem
we have during daylight-savings transition hours: when you "spring
forward" there is no local time 02:30, and when you "fall back" there
are two of 'em.

The solution we've adopted for DST transitions is to interpret invalid
or ambiguous local times as "always standard time". We could possibly
do the same for the questionable times at the ends of the Unix date
range, ie, always interpret them as UTC (although I've been fooling with
the code for a couple hours now trying to get it to do that, without
much success).

Plan B would be to get rid of the discontinuity by abandoning the rule
that timestamps outside the Unix range are UTC. We could instead say
that the local time zone offset that mktime() reports for the first date
of the Unix range applies to all prior dates, and similarly the offset
for the last date of the range applies to all later dates.

I'm unsure which of these is a better answer. Any thoughts?

regards, tom lane

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Brian J. Erickson 2003-08-25 22:12:30 ODBC, SQLExecute and HY010
Previous Message Stephan Szabo 2003-08-25 15:52:34 Re: index not used afer VACUUM ANALYZE

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2003-08-25 21:04:28 Re: NOTICE vs WARNING
Previous Message Franco Bruno Borghesi 2003-08-25 19:42:53 Re: Buglist