Re: Redhat 7.3 time manipulation bug

From: cbbrowne(at)cbbrowne(dot)com
To: PostgreSQL Hackers List <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Redhat 7.3 time manipulation bug
Date: 2002-05-25 00:37:24
Message-ID: 20020525003724.B11FD35B0F@cbbrowne.com
Lists: pgsql-hackers

> > > The last phase could be extending the API to allow multiple simultaneous
> > > time zones, detection of bad time zones, etc etc. This would involve API
> > > changes or extensions, and breaks compatibility with system-supplied
> > > infrastructure.
> > One thing that wasn't clear to me, but could use investigation: if so
> > many systems are using the same underlying timezone database info, maybe
> > there is some commonality at a level below the ISO mktime/tzset/etc API.
> > If we could make use of the system-provided TZ database at a lower level
> > while still using our own APIs not tied to time_t, it'd answer the issue
> > of compatibility with the surrounding system. (Which is a real issue,
> > I agree --- we should be able to accept the system's standard TZ setting
> > if possible.)

> The fundamental problem (which of course can have a fundamental
> solution ;) is that a time zone database built with a 32-bit time_t
> will have time zone info through 2038 only (it is a binary file with
> 32-bit time fields -- almost certainly anyway). So if we have an
> extended time zone infrastructure using something different for time_t
> we would need to handle the case of reading non-extended time zones
> databases, which puts us back to having limitations.

Ah, but the database in question _doesn't_ consist of 32-bit time_t
values, at least not in its source form.

It consists of things like:

# @(#)zone.tab 1.26
#
# TZ zone descriptions
#
# From Paul Eggert <eggert(at)twinsun(dot)com> (1996-08-05):
#
# This file contains a table with the following columns:
# 1. ISO 3166 2-character country code. See the file `iso3166.tab'.
# 2. Latitude and longitude of the zone's principal location
# in ISO 6709 sign-degrees-minutes-seconds format,
# either +-DDMM+-DDDMM or +-DDMMSS+-DDDMMSS,
# first latitude (+ is north), then longitude (+ is east).
# 3. Zone name used in value of TZ environment variable.
# 4. Comments; present if and only if the country has multiple rows.
#
# Columns are separated by a single tab.
# The table is sorted first by country, then an order within the country that
# (1) makes some geographical sense, and
# (2) puts the most populous zones first, where that does not contradict (1).
#
# Lines beginning with `#' are comments.
#
#country-
#code coordinates TZ comments
AD +4230+00131 Europe/Andorra
AE +2518+05518 Asia/Dubai
AF +3431+06912 Asia/Kabul
AG +1703-06148 America/Antigua
AI +1812-06304 America/Anguilla
AL +4120+01950 Europe/Tirane
AM +4011+04430 Asia/Yerevan
AN +1211-06900 America/Curacao
AO -0848+01314 Africa/Luanda
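To make the point concrete that these rows are plain text with no time_t anywhere in them, here is a minimal sketch of a reader for zone.tab lines. The names (ZoneTabRow, parse_zone_tab_line) are made up for illustration, and it splits on any whitespace rather than strictly on tabs so that the rows as pasted above parse too; treat it as an assumption-laden sketch, not as how any existing tool does it.

```python
import re
from typing import NamedTuple, Optional


class ZoneTabRow(NamedTuple):
    country: str            # ISO 3166 two-letter code
    latitude: float         # decimal degrees, north positive
    longitude: float        # decimal degrees, east positive
    tz: str                 # value usable in the TZ environment variable
    comment: Optional[str]  # only present for multi-zone countries


# ISO 6709 as described in the file header:
# +-DDMM+-DDDMM or +-DDMMSS+-DDDMMSS
_COORD = re.compile(
    r"^([+-])(\d{2})(\d{2})(\d{2})?"
    r"([+-])(\d{3})(\d{2})(\d{2})?$"
)


def _to_degrees(sign, deg, minutes, seconds):
    value = int(deg) + int(minutes) / 60 + int(seconds or 0) / 3600
    return -value if sign == "-" else value


def parse_zone_tab_line(line):
    """Return a ZoneTabRow, or None for blank and comment lines."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    # Assumption: split on any whitespace, keeping the comment field whole.
    fields = line.split(None, 3)
    country, coords, tz = fields[:3]
    comment = fields[3] if len(fields) > 3 else None
    match = _COORD.match(coords)
    if match is None:
        raise ValueError("unrecognised coordinates: " + coords)
    groups = match.groups()
    return ZoneTabRow(country, _to_degrees(*groups[:4]),
                      _to_degrees(*groups[4:]), tz, comment)
```

For example, parse_zone_tab_line("AD +4230+00131 Europe/Andorra") yields latitude 42.5 and a longitude a little over 1.5 degrees east.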

Then a "leapseconds" table, looking like:
# The correction (+ or -) is made at the given time, so lines
# will typically look like:
# Leap YEAR MON DAY 23:59:60 + R/S
# or
# Leap YEAR MON DAY 23:59:59 - R/S

# If the leapsecond is Rolling (R) the given time is local time
# If the leapsecond is Stationary (S) the given time is UTC

# Leap YEAR MONTH DAY HH:MM:SS CORR R/S
Leap 1972 Jun 30 23:59:60 + S
Leap 1972 Dec 31 23:59:60 + S
Leap 1973 Dec 31 23:59:60 + S
Leap 1974 Dec 31 23:59:60 + S
Leap 1975 Dec 31 23:59:60 + S
Leap 1976 Dec 31 23:59:60 + S
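
The leap second lines are just as easy to read as text. The sketch below is illustrative only (LeapSecond and parse_leap_line are hypothetical names); it keeps the time of day as a string because 23:59:60 is not something most time types will accept.

```python
from typing import NamedTuple

MONTHS = {name: number for number, name in enumerate(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"], start=1)}


class LeapSecond(NamedTuple):
    year: int
    month: int
    day: int
    time: str         # kept as text, e.g. "23:59:60"
    correction: int   # +1 for an inserted second, -1 for a removed one
    rolling: bool     # True for R (local time), False for S (UTC)


def parse_leap_line(line):
    """Return a LeapSecond, or None for comments and non-Leap lines."""
    fields = line.split("#", 1)[0].split()
    if not fields or fields[0] != "Leap":
        return None
    _, year, month, day, time, corr, kind = fields
    return LeapSecond(int(year), MONTHS[month], int(day), time,
                      1 if corr == "+" else -1, kind == "R")


def total_correction(lines):
    """Sum the corrections over all Leap lines in an iterable of text lines."""
    parsed = (parse_leap_line(line) for line in lines)
    return sum(entry.correction for entry in parsed if entry is not None)
```

Fed the six Leap lines quoted above, total_correction returns 6.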

And then a set of rules about timezone adjustments for all sorts of
localities, including the following:

# Rule NAME FROM TO TYPE IN ON AT SAVE LETTER/S
# Summer Time Act, 1916
Rule GB-Eire 1916 only - May 21 2:00s 1:00 BST
Rule GB-Eire 1916 only - Oct 1 2:00s 0 GMT
# S.R.&O. 1917, No. 358
Rule GB-Eire 1917 only - Apr 8 2:00s 1:00 BST
Rule GB-Eire 1917 only - Sep 17 2:00s 0 GMT

# Zone NAME GMTOFF RULES FORMAT [UNTIL]
Zone Antarctica/Casey 0 - zzz 1969
                      8:00 - WST # Western (Aus) Standard Time
Zone Antarctica/Davis 0 - zzz 1957 Jan 13
                      7:00 - DAVT 1964 Nov # Davis Time
                      0 - zzz 1969 Feb
                      7:00 - DAVT
Zone Antarctica/Mawson 0 - zzz 1954 Feb 13
                       6:00 - MAWT # Mawson Time
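
As a rough illustration of why the rules themselves carry no 2038 limit, here is a sketch that reads a Rule line into ordinary integers and strings. The Rule tuple and parse_rule_line are hypothetical names, and the sketch assumes FROM is always a numeric year and that TO is a year, "only", or "max".

```python
from typing import NamedTuple, Optional


class Rule(NamedTuple):
    name: str
    from_year: int
    to_year: Optional[int]  # None stands for "max" (open-ended)
    month: str              # IN column, e.g. "May"
    on: str                 # ON column, e.g. "21" or "lastSun"
    at: str                 # AT column kept as text, e.g. "2:00s"
    save_seconds: int       # SAVE column converted to seconds
    letters: str            # LETTER/S column


def _save_to_seconds(text):
    """Convert "1:00", "0", or "-1:00" style values to seconds."""
    sign = -1 if text.startswith("-") else 1
    parts = [int(p) for p in text.lstrip("-").split(":")]
    parts += [0] * (3 - len(parts))
    hours, minutes, seconds = parts
    return sign * (hours * 3600 + minutes * 60 + seconds)


def parse_rule_line(line):
    """Return a Rule, or None for comments and non-Rule lines."""
    fields = line.split("#", 1)[0].split()
    if not fields or fields[0] != "Rule":
        return None
    _, name, frm, to, _type, month, on, at, save, letters = fields
    from_year = int(frm)  # assumption: FROM is a plain year
    if to == "only":
        to_year = from_year
    elif to == "max":
        to_year = None
    else:
        to_year = int(to)
    return Rule(name, from_year, to_year, month, on, at,
                _save_to_seconds(save), letters)
```

Nothing here is tied to the width of time_t: the years are just integers, so a reader with its own wider time type is limited only by what the rules say, not by how the system compiled them.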

> I'm guessing that a better approach might be to have our time zone
> stuff inside our own API, which then could choose to call, for
> example, mktime() or pg_mktime(), which could each have different
> signatures. Then the heuristics for matching one to the other are
> isolated to our thin API implementation, not to the underlying system-
> or pg-provided libraries.

> matching "stringy time zones" to numeric offsets for input date/times.
> The time zone databases themselves don't lend themselves to this,
> since the tables have those stringy zones somewhere on the right hand
> side of each row of information and the fields can change from year to
> year.

The ultimate goal would likely be to store dates internally in some
form like UTC, with a reasonably huge dynamic range: that is, not
limited to 32-bit timestamps, but using something like a proleptic
Gregorian calendar (per _Calendrical Calculations_, page 50).

Some reasonable treatments would include:

- A 32-bit signed int indicating the number of days since GREG_EPOCH,
where logical epochs would include January 1, 1; January 1, 1900; or
perhaps even an epoch that is still in the future, such as
January 1, 2038.

- 8 bits indicating the month; 8 bits indicating the day of month;
16 bits providing a range of years from -32768 to 32767.

Both have merits...
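
Here is a rough sketch of the two layouts, for comparison. The day-count arithmetic is the well-known civil-date-to-day-number calculation for the proleptic Gregorian calendar, not anything taken from PostgreSQL; the function names, the default epoch, and the choice to put the year in the high bits of the packed form are all assumptions made for illustration.

```python
from datetime import date


def _days_from_civil(year, month, day):
    """Day number in the proleptic Gregorian calendar (1970-01-01 is 0)."""
    year -= month <= 2                  # treat Jan/Feb as end of prior year
    era = year // 400                   # floor division also handles year < 0
    year_of_era = year - era * 400
    shifted_month = month - 3 if month > 2 else month + 9   # March is 0
    day_of_year = (153 * shifted_month + 2) // 5 + day - 1
    day_of_era = (year_of_era * 365 + year_of_era // 4
                  - year_of_era // 100 + day_of_year)
    return era * 146097 + day_of_era - 719468


def days_since(year, month, day, epoch=(1, 1, 1)):
    """Layout 1: signed day count since a chosen epoch (must fit 32 bits)."""
    days = _days_from_civil(year, month, day) - _days_from_civil(*epoch)
    if not -2**31 <= days < 2**31:
        raise OverflowError("day count does not fit a signed 32-bit int")
    return days


def pack_date(year, month, day):
    """Layout 2: 16-bit signed year, 8-bit month, 8-bit day in one word."""
    if not -32768 <= year <= 32767:
        raise OverflowError("year does not fit a signed 16-bit field")
    return ((year & 0xFFFF) << 16) | (month << 8) | day


def unpack_date(packed):
    year = packed >> 16
    if year >= 0x8000:                  # undo two's-complement wrap
        year -= 0x10000
    return year, (packed >> 8) & 0xFF, packed & 0xFF


# Cross-check against the standard library, whose ordinal 1 is 0001-01-01.
assert days_since(2002, 5, 25) == date(2002, 5, 25).toordinal() - 1
```

The day count makes date arithmetic a plain subtraction and covers millions of years in either direction; the packed form is trivial to display but needs calendar logic for any arithmetic, and tops out at the 16-bit year range.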

Timestamps would then forcibly expand things by _at least_ 24 bits, the
minimum needed to express 1/100ths of seconds within a day (8,640,000
distinct values). Might as well head on to 32 bits for the time and so
have something that can easily represent values down to well below a
millisecond.
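
The bit counts for a time-of-day field are easy to check with a little arithmetic. This is only a back-of-envelope sketch: the helper names are made up, and it ignores leap seconds by assuming exactly 86,400 seconds per day.

```python
SECONDS_PER_DAY = 86_400   # assumption: leap seconds ignored


def bits_needed(ticks_per_second):
    """Smallest field width that can number every tick within one day."""
    states = SECONDS_PER_DAY * ticks_per_second
    return (states - 1).bit_length()


def max_ticks_per_second(bits):
    """Finest whole-number resolution a field of this width can hold."""
    return 2**bits // SECONDS_PER_DAY


if __name__ == "__main__":
    for label, ticks in [("seconds", 1), ("1/100 s", 100), ("ms", 1000)]:
        print(label, bits_needed(ticks), "bits")
    print("32 bits:", max_ticks_per_second(32), "ticks per second")
```

Whole seconds need 17 bits, hundredths need 24, and milliseconds need 27. A full 32 bits allows 49,710 ticks per second, which is a resolution of about 20 microseconds.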

The "stringy stuff" indicates how values are to be displayed or parsed.
It does nothing about what is stored internally, or at least shouldn't.
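
A small sketch of that separation, using fixed UTC offsets as stand-ins for real zone rules. The offsets, labels, and function names here are illustrative assumptions; a real implementation would look the offset up from the rules for the instant in question.

```python
from datetime import datetime, timedelta, timezone

# What is stored: one unambiguous instant in UTC (an arbitrary example).
stored = datetime(2002, 5, 25, 12, 0, 0, tzinfo=timezone.utc)


def render(instant, offset_hours, label):
    """Display a stored instant under a given zone offset and label."""
    zone = timezone(timedelta(hours=offset_hours), label)
    return instant.astimezone(zone).strftime("%Y-%m-%d %H:%M:%S %Z")


def parse(text, offset_hours):
    """Read local wall-clock text and normalise it back to UTC."""
    zone = timezone(timedelta(hours=offset_hours))
    local = datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
    return local.replace(tzinfo=zone).astimezone(timezone.utc)
```

Rendering the same stored value with render(stored, -4, "EDT") and render(stored, 1, "BST") gives two different strings, and parsing either string back with the matching offset returns the identical stored instant.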
--
(reverse (concatenate 'string "gro.gultn@" "enworbbc"))
http://www.cbbrowne.com/info/emacs.html
In the name of the Lord-High mutant, we sacrifice this suburban girl
-- `Future Schlock'
