Support TZ format code in to_timestamp()

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Support TZ format code in to_timestamp()
Date: 2023-06-13 16:20:42
Message-ID: 1681086.1686673242@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

It's annoyed me for some time that to_timestamp() doesn't implement
the TZ format code that to_char() has. I finally got motivated to
do something about that after the complaint at [1] that jsonpath's
datetime() method can't read typical JSON.stringify() output like
"2023-05-22T03:09:37.825Z". We do already understand "Z" as a
time zone abbreviation for UTC; we just need to get formatting.c
to support this.

Hence, the attached patch teaches to_timestamp() to read time
zone abbreviations as well as HH and HH:MM numeric zone offsets
when TZ is specified. (We need to accept HH and HH:MM to be
sure that we can round-trip the output of to_char(), since its
TZ output code will fall back to one of those if it does not
know any abbreviation for the current zone.)

You might reasonably say that we should make it read time zone names
not only abbreviations. I tried to make that work, and realized that
it'd create a complete mess because tzload() is so lax about what it
will interpret as a POSIX-style timezone spec. I included an example
in the test cases below: I think that

to_timestamp('2011-12-18 11:38ESTFOO24', 'YYYY-MM-DD HH12:MITZFOOSS')

should work and read just "EST" as the TZ value, allowing the "24"
to be read as the SS value. But tzload() will happily eat all of
"ESTFOO24" as a POSIX zone spec.

We could conceivably refactor tzload() enough to match only tzdb zone
names in this context. But I'm very hesitant to do that, for a few
reasons:

* it would make localtime.c diverge significantly from the upstream
IANA source code;

* we only need to support zone abbreviations to ensure we can
round-trip the output of to_char();

* it's not clear to me that average users would understand why
to_timestamp() accepts some but not all zone names that are accepted
by the TimeZone GUC and timestamptz input. If we document it as
taking only timezone abbreviations, that does correspond to a
concept that's in the manual already.

So I think that the attached represents a reasonable and useful
compromise. I'll park this in the July commitfest.

regards, tom lane

[1] https://www.postgresql.org/message-id/flat/014A028B-5CE6-4FDF-AC24-426CA6FC9CEE%40mohiohio.com

Attachment Content-Type Size
v1-0001-Support-timezone-abbreviations-in-to_timestamp.patch text/x-diff 22.0 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2023-06-13 16:58:54 Add wait event for log emission?
Previous Message James Coleman 2023-06-13 15:53:39 Re: RFC: Logging plan of the running query