Re: Patch: shouldn't timezone(text, timestamp[tz]) be STABLE?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Aleksander Alekseev <aleksander(at)timescale(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Patch: shouldn't timezone(text, timestamp[tz]) be STABLE?
Date: 2021-09-05 18:57:33
Message-ID: 2944835.1630868253@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Aleksander Alekseev <aleksander(at)timescale(dot)com> writes:
>>> [ timetz_zone is VOLATILE ]
>> I wasn't excited enough about it personally to change it, and I'm
>> still not --- but if you want to, send a patch.

> Here is the patch.

I looked at this patch, and felt unhappy about the fact that it left
timetz_zone() still depending on pg_time_t and pg_localtime, which
nothing else in date.c does. Poking at it closer, I realized that
the DYNTZ code path is actually completely broken, and has been
for years. Observe:

regression=# select timezone('America/Santiago', '12:34 -02'::timetz);
timezone
-------------
11:34:00-03
(1 row)

That's fine. But CLT, which should be entirely equivalent
to America/Santiago, produces seeming garbage:

regression=# select timezone('CLT', '12:34 -02'::timetz);
timezone
-------------------
09:51:14-04:42:46
(1 row)

<digression>
What's happening there is that pg_localtime produces a struct tm
containing POSIX-style values, in particular tm_year is relative
to 1900. But DetermineTimeZoneAbbrevOffset expects a struct using
the PG convention that tm_year is relative to "AD 0". So it sees
a date in the second century AD, decides that that's way out of
range, and falls back to the "LMT" offset provided by the tzdb
database. That lines up with what you'd get from

regression=# set timezone = 'America/Santiago';
SET
regression=# select '0121-09-03 12:34'::timestamptz;
timestamptz
------------------------------
0121-09-03 12:34:00-04:42:46
(1 row)

</digression>

Basically the problem here is that this is incredibly hoary code
that's never been touched or tested as we revised datetime-related
APIs elsewhere. I'm fairly unhappy now that we don't have any
regression test coverage for this function. However, I see no
very good way to make that happen, because the interesting code
paths will (by definition) produce different results at different
times of year. I suppose we could carry two variant expected-files,
but ick. The DYNTZ path is particularly problematic here, because
that's only used for timezones that have changed definitions over
time, meaning they're particularly likely to change again.

Anyway, attached is a revised patch that gets rid of the antique
code, and it produces correct results AFAICT.

BTW, it's customary to *not* include catversion bumps in submitted
patches, because that accomplishes little except to ensure that
your patch will soon fail to apply. (This one already is failing.)
If you feel a need to remind the committer that a catversion bump
is needed, just comment to that effect in the submission email.

I'm not entirely sure what to do about the discovery that the
DYNTZ path has pre-existing breakage. Perhaps it'd be sensible
to back-patch this patch, minus the catalog change. I doubt that
anyone would have a problem with the nominal change of behavior
near DST boundaries. Or we could just ignore the bug in the back
branches, since it's fairly clear that basically no one uses this
function.

regards, tom lane

Attachment Content-Type Size
v4-0001-timezone_volatility.patch text/x-diff 2.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2021-09-05 19:14:16 Re: pg_dump handling of ALTER DEFAULT PRIVILEGES IN SCHEMA
Previous Message Alvaro Herrera 2021-09-05 18:11:13 Re: corruption of WAL page header is never reported