Re: UCT (Re: pgsql: Update time zone data files to tzdata release 2019a.)

From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Andres Freund <andres(at)anarazel(dot)de>, Christoph Berg <myon(at)debian(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Date: 2019-06-28 00:33:58
Message-ID: 87pnmyr3vc.fsf@news-spur.riddles.org.uk
Lists: pgsql-committers pgsql-hackers

>>>>> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> I'm kind of unsure what to think about this whole debate
>> substantively. If Andrew is correct that zone.tab or zone1970.tab is
>> a list of time zone names to be preferred over alternatives, then it
>> seems like we ought to prefer them.

Tom> It's not really clear to me that the IANA folk intend those files
Tom> to be read as a list of preferred zone names.

The files exist to support user selection of zone names. That is, it is
intended that you can use them to allow the user to choose their country
and then timezone within that country, rather than offering them a flat
regional list (which can be large and the choices non-obvious).

The zone*.tab files therefore include only geographic names, and neither
POSIX-style abbreviations nor special cases like Etc/UTC. Programs
that use zone*.tab to allow user selection handle cases like that
separately (for example, FreeBSD's tzsetup offers "UTC" at the
"regional" menu).
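As a rough illustration of that country-then-zone selection, here is a minimal Python sketch that groups zone names by country code. The sample text is an abbreviated excerpt in the zone1970.tab layout (tab-separated country codes, coordinates, TZ name, optional comment), and the function name zones_by_country is hypothetical; a real tool would read the installed file instead.

```python
from collections import defaultdict

# Abbreviated sample in the zone1970.tab layout (an assumption for
# illustration; a real program would read the installed file).
SAMPLE_ZONE1970_TAB = (
    "# country-codes\tcoordinates\tTZ\tcomments\n"
    "AD\t+4230+00131\tEurope/Andorra\n"
    "AE,OM\t+2518+05518\tAsia/Dubai\n"
    "US\t+404251-0740023\tAmerica/New_York\tEastern (most areas)\n"
    "US\t+394421-1045903\tAmerica/Denver\tMountain (most areas)\n"
)


def zones_by_country(tab_text):
    """Map each country code to the zone names listed for it, in file order."""
    result = defaultdict(list)
    for line in tab_text.splitlines():
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        codes, tz_name = fields[0], fields[2]
        # One line may cover several countries (comma-separated codes).
        for code in codes.split(","):
            result[code].append(tz_name)
    return dict(result)


menu = zones_by_country(SAMPLE_ZONE1970_TAB)
print(menu["US"])
```

Note that nothing like "UTC" comes out of such a menu; a selection tool has to offer it separately.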

It's quite possible that people have implemented time zone selection
interfaces that use some other presentation of the list, but that
doesn't particularly diminish the value of zone*.tab. In particular, the
current zone1970.tab has:

 - at least one entry for every iso3166 country code that's not an
   uninhabited remote island;

 - an entry for every distinct "Zone" in the primary data files, with
   the exception of entries that are specifically commented as being
   for backward compatibility (e.g. CET, CST6CDT, etc. - see the
   comments in the europe and northamerica data files for why these
   exist)

The zonefiles that get installed in addition to the ones in zone1970.tab
fall into these categories:

 - they are "Link" entries in the primary data files

 - they are from the "backward" data file, which is omitted in some
   system tzdb installations because it exists only for backward
   compatibility (but we install it because it's still listed in
   tzdata.zi by default)

 - they are from the "etcetera" file, which lists special cases such as
   UTC and fixed UTC offsets
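To make the "Link" category concrete, here is a small Python sketch that collects alias-to-target pairs from Link lines in tz source format ("Link TARGET LINK-NAME"). The sample text is an abbreviated excerpt in the style of the "backward" file, and parse_links is a hypothetical helper, not anything from the tzdb or PostgreSQL sources.

```python
# Abbreviated sample in the style of the tzdb "backward" source file
# (an assumption for illustration; the real file is much longer).
SAMPLE_BACKWARD = (
    "# Link\tTARGET\tLINK-NAME\n"
    "Link\tAmerica/Denver\tNavajo\n"
    "Link\tEurope/London\tGB\n"
    "Link\tEurope/Dublin\tEire\n"
)


def parse_links(source_text):
    """Return a dict mapping each link name (alias) to its target zone."""
    links = {}
    for line in source_text.splitlines():
        # Drop trailing comments, then split on any whitespace.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 3 and fields[0] == "Link":
            target, alias = fields[1], fields[2]
            links[alias] = target
    return links


links = parse_links(SAMPLE_BACKWARD)
print(links["Navajo"])
```

Any name that appears as a key here is an alias rather than a distinct "Zone", so it would not be expected in zone1970.tab.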

Tom> If they do, what are we to make of the fact that no variant of
Tom> "UTC" appears in them?

That "UTC" is not a geographic timezone name?

>> He remarks that we are preferring "deprecated backward-compatibility
>> aliases" and to the extent that this is true, it seems like a bad
>> thing. We can't claim to be altogether apolitical here, because when
>> those deprecated backward-compatibility names are altogether
>> removed, we are going to remove them and they're going to stop
>> working. If we know which ones are likely to suffer that fate
>> eventually, we ought to stop spitting them out. It's no more
>> political to de-prefer them when upstream does than it is to remove
>> them when upstream does.

Tom> I think that predicting what IANA will do in the future is a
Tom> fool's errand.

Maybe so, but when something is explicitly in a file called "backward",
and the upstream-provided Makefile has specific options for omitting it
(even though it is included by default), and all the comments about it
are explicit about it being for backward compatibility, I think it's
reasonable to avoid _preferring_ the names in it.

The list of backward-compatibility zones is in any case extremely
arbitrary and nonsensical: for example "GB", "Eire", "Iceland",
"Poland", "Portugal" are aliases for their respective countries, but
there are no comparable aliases for any other European country. The
"Navajo" entry (an alias for America/Denver) has already been mentioned
in this thread; our arbitrary rule prefers it (due to shortness) for all
US zones that use Mountain time with DST. And so on.
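The difference between the two selection rules can be sketched in a few lines of Python. This is a simplified model, not the actual patch and not PostgreSQL's exact tie-breaking: it assumes "shortest name wins, ties broken alphabetically" for the current rule, and the function names and the contents of `listed` are hypothetical.

```python
def pick_shortest(aliases):
    """Simplified stand-in for the current rule: shortest name wins."""
    return min(aliases, key=lambda name: (len(name), name))


def pick_preferring_listed(aliases, listed):
    """Prefer aliases that appear in zone*.tab; otherwise fall back.

    `listed` is the set of names read from zone.tab or zone1970.tab.
    If it is empty (file unavailable) or none of the aliases is in it,
    the result is the same as pick_shortest.
    """
    preferred = [name for name in aliases if name in listed]
    return pick_shortest(preferred or aliases)


# Names that all resolve to the same zone data (illustrative set).
aliases = ["America/Denver", "America/Shiprock", "Navajo", "US/Mountain"]
# Hypothetical subset of names taken from zone1970.tab.
listed = {"America/Denver", "America/New_York"}

print(pick_shortest(aliases))
print(pick_preferring_listed(aliases, listed))
print(pick_preferring_listed(aliases, set()))
```

With these inputs the shortest-name rule selects "Navajo", while the zone*.tab-aware rule selects "America/Denver" and degrades to the old behaviour when no list is available.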

Tom> Our contract is to select some one of the aliases that the tzdb
Tom> database presents, not to guess about whether it might present a
Tom> different set in the future. (Also note that a lot of the observed
Tom> variation here has to do with whether individual platforms choose
Tom> to install backward-compatibility zone names. I think the odds
Tom> that IANA proper will remove those links are near zero; TTBOMK
Tom> they never have removed one yet.)

Well, we should also consider the possibility that we might be using the
system tzdata and that the upstream OS or distro packager may choose to
remove the "backward" data or split it to a separate package.

Tom> More generally, my unhappiness about Andrew's proposal is:

[...]
Tom> 3. The proposal has technical issues, in particular I'm not nearly
Tom> as sanguine as Andrew is about whether we can rely on
Tom> zone[1970].tab to be available.

My proposal works even if it's not, though I don't expect that to be an
issue in practice.

--
Andrew (irc:RhodiumToad)
