Re: Allow to_date() and to_timestamp() to accept localized names

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Juan José Santamaría Flecha <juanjo(dot)santamaria(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Arthur Zakirov <zaartur(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Allow to_date() and to_timestamp() to accept localized names
Date: 2020-01-28 17:30:42
Message-ID: 15575.1580232642@sss.pgh.pa.us
Lists: pgsql-hackers

Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com> writes:
> But then the manual page goes on to say:

>> %E* %O*
>> POSIX locale extensions. The sequences %Ec %EC %Ex %EX %Ey %EY %Od %Oe %OH %OI %Om %OM %OS %Ou %OU %OV %Ow %OW %Oy are supposed to provide alternate representations.
>>
>> Additionally %OB implemented to represent alternative months names (used standalone, without day mentioned).

> This is the part I haven’t played with, but it sounds like it can handle at least one alternate name. Perhaps you can get the alternates this way?

This sounded promising, but the POSIX strftime spec doesn't mention %OB,
so I'm afraid we can't count on it to do much. At this point I'm not
really convinced that there are no languages with more than two forms,
anyway :-(.

I also wondered whether we could get any further by using strptime() to
convert localized month and day names on-the-fly, rather than the patch's
current approach of re-using strftime() results. If strptime() fails
to support alternative names, it's their bug not ours. Unfortunately,
glibc has got said bug (AFAICS anyway), so in practice this would only
offer us plausible deniability and not much of any real functionality.
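
For illustration only, the two options could be sketched in C roughly as follows. This is not the patch's code: the function names month_via_strftime and month_via_strptime are hypothetical, and both rely on whatever LC_TIME the caller has already selected with setlocale(). The first can recognize only the single form that strftime() emits for each month; the second accepts whatever the platform's strptime() accepts, which may be no more than that.

```c
#define _XOPEN_SOURCE 700		/* exposes strptime() on glibc */
#include <assert.h>
#include <stddef.h>
#include <strings.h>
#include <time.h>

/*
 * Option 1 (hypothetical helper): compare the input against the month
 * names that strftime("%B") produces in the current locale.
 * Returns 0..11, or -1 if nothing matches.
 */
int
month_via_strftime(const char *name)
{
	char		buf[128];

	assert(name != NULL);
	for (int m = 0; m < 12; m++)
	{
		struct tm	tm = {0};

		tm.tm_mon = m;
		tm.tm_mday = 1;
		if (strftime(buf, sizeof(buf), "%B", &tm) == 0)
			continue;			/* name did not fit; skip this month */
		if (strcasecmp(buf, name) == 0)
			return m;
	}
	return -1;
}

/*
 * Option 2 (hypothetical helper): hand the input to strptime() and let
 * the C library decide.  Returns 0..11, or -1 if the library rejects
 * the input or leaves trailing characters unparsed.
 */
int
month_via_strptime(const char *name)
{
	struct tm	tm = {0};
	const char *end;

	assert(name != NULL);
	end = strptime(name, "%B", &tm);
	if (end == NULL || *end != '\0')
		return -1;
	return tm.tm_mon;
}
```

Note that strcasecmp() is only a byte-wise, ASCII-oriented comparison here; real code would need proper multibyte case folding.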

In the end it seems like we could only handle alternative names by
keeping our own lists of them. There are probably few enough cases
that that wouldn't be a tremendous maintenance problem, but what
I'm not quite seeing is how we'd decide which list to use when.
Right now, locale identifiers are pretty much opaque to us ... do
we really want to get into the business of recognizing that such a
name refers to German, or Greek?

A brute-force answer, if there are few enough cases, is to recognize
them regardless of the specific value of LC_TIME. Do we think
anybody would notice or care if to_date('Sonnabend', 'TMDay') works
even when in a non-German locale?
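
As a rough sketch of that brute-force idea, assuming a small hand-maintained table consulted without looking at LC_TIME at all: the type AltDayName, the table alt_day_names, and the function lookup_alt_day_name are all hypothetical names, and the only entry shown is the one mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One alternative day name and the weekday it stands for (0 = Sunday). */
typedef struct AltDayName
{
	const char *name;
	int			wday;
} AltDayName;

/* Hypothetical table; it would grow as further cases are identified. */
static const AltDayName alt_day_names[] = {
	{"Sonnabend", 6},			/* German alternative to Samstag */
};

/*
 * Look the input up in the table, ignoring the active locale.
 * Returns 0..6 (tm_wday numbering), or -1 if the name is not listed.
 * The comparison is exact; case folding is left out of this sketch.
 */
int
lookup_alt_day_name(const char *name)
{
	size_t		n = sizeof(alt_day_names) / sizeof(alt_day_names[0]);

	assert(name != NULL);
	for (size_t i = 0; i < n; i++)
	{
		if (strcmp(alt_day_names[i].name, name) == 0)
			return alt_day_names[i].wday;
	}
	return -1;
}
```

A lookup like this would presumably run only after the locale's own names have failed to match, so the regular names keep working exactly as before.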

regards, tom lane
