Re: Solaris versus our NLS files

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Solaris versus our NLS files
Date: 2025-12-09 22:23:19
Message-ID: 299454.1765318999@sss.pgh.pa.us
Lists: pgsql-hackers

Thomas Munro <thomas(dot)munro(at)gmail(dot)com> writes:
> On Wed, Dec 10, 2025 at 10:22 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> After some quality time with Google, I learned why: with Solaris's
>> apparently-locally-hacked version of gettext, it's not good enough
>> to have $INSTALLATION/share/locale/ subdirectories named like
>> "es", "fr", etc. They have to be named after the
>> fully-spelled-out locale names like "es_ES.UTF-8".

> Is it really locally hacked, or is it just Sun's libc[1], which
> invented gettext() in the first place, and then later added GNU's
> extensions and .mo format after GNU's reimplementation became
> widespread?

Sorry, I was imprecise there. This is Solaris' libc implementation:
configure reports

configure:18402: checking for library containing bind_textdomain_codeset
configure:18450: result: none required

and I don't see any libintl listed in "ldd postgres" either.

> From some (very) limited research on the topic, one thing
> that GNU's reimplementation added that Sun's never had is the ability
> to open a .mo with the wrong encoding and transcode it. Perhaps that
> explains Sun's insistence on finding an exact match, and I guess that
> might mean that you could get either mojibake or some kind of error if
> you create codesetless symlinks (which I guess it would normally only
> use when your locale's name doesn't have the codeset suffix, and then
> I guess it would expect Latin-9 or whatever it thinks "es_ES" has)?

Like some other platforms, it flat out won't accept codeset-less
lc_messages settings:

postgres=# SET lc_messages = 'es_ES';
ERROR: invalid value for parameter "lc_messages": "es_ES"
postgres=# SET lc_messages = 'es_ES.UTF-8';
SET
postgres=# select 1/0;
ERROR: división por cero

This is with the symlink in place. Yes, I did try making a symlink
named "es_ES" as well, but it didn't help; apparently there's some
central source of truth about what the valid locale names are.
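
For illustration, the symlink arrangement discussed above could be sketched roughly as follows. This is only a sketch under stated assumptions: the function name, the mapping of full locale names to short directory names, and the example install path are all hypothetical, and nothing here is taken from the PostgreSQL build system.

```python
import os


def link_locale_dirs(locale_root, mapping):
    """Create relative symlinks such as es_ES.UTF-8 -> es under locale_root.

    mapping is a hypothetical dict of full locale name -> short directory
    name, e.g. {"es_ES.UTF-8": "es"}.  Returns the link names created.
    """
    created = []
    for full_name, short_name in sorted(mapping.items()):
        target = os.path.join(locale_root, short_name)
        link = os.path.join(locale_root, full_name)
        if not os.path.isdir(target):
            continue  # no message catalogs installed for this language
        if os.path.lexists(link):
            continue  # leave any existing file, directory, or link alone
        os.symlink(short_name, link)  # relative, so the tree stays relocatable
        created.append(full_name)
    return created


# Hypothetical usage:
#   link_locale_dirs("/path/to/install/share/locale",
#                    {"es_ES.UTF-8": "es", "fr_FR.UTF-8": "fr"})
```

Which full locale names to generate is exactly the open question; the mapping is left to the caller for that reason.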

It apparently is possible to install GNU gettext on top of Solaris,
although you then get into some fun about conflicts between GNU-
and OS-supplied headers. But I've not tried that here.

If you're right about Sun not doing transcoding, then I guess we would
only need to create symlinks matching the encodings used in our .po
files, which'd remove the symlink bloat problem and replace it with
how-do-we-extract-that-encoding-name ... although it looks like all
but one is in UTF-8, so maybe we should just decree they have to be
in UTF-8? The lone exception is src/bin/pg_config/po/nb.po, which
seems not to have been touched since 2013.
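
As a rough sketch of the how-do-we-extract-that-encoding-name part: a .po file normally declares its encoding in a header line of the form "Content-Type: text/plain; charset=...", so a small script can read it from there. The helper names below are hypothetical, and the sketch assumes the first Content-Type line in the file is the header one.

```python
import codecs
import os
import re

# Matches the charset token in a header line such as
#   "Content-Type: text/plain; charset=UTF-8\n"
CHARSET_RE = re.compile(rb"charset=([A-Za-z0-9_.:-]+)", re.IGNORECASE)


def po_charset(po_bytes):
    """Return the charset named in the first Content-Type line, or None."""
    for line in po_bytes.splitlines():
        if b"Content-Type:" in line:
            match = CHARSET_RE.search(line)
            if match:
                return match.group(1).decode("ascii")
    return None


def is_utf8(charset):
    """True if charset is a spelling of UTF-8 that Python recognizes."""
    try:
        return codecs.lookup(charset).name == "utf-8"
    except LookupError:
        return False  # unknown name, e.g. an unfilled CHARSET placeholder


def non_utf8_po_files(root):
    """Yield (path, charset) for each .po file under root not declared UTF-8."""
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            if not name.endswith(".po"):
                continue
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                charset = po_charset(f.read())
            if charset is None or not is_utf8(charset):
                yield path, charset
```

Running non_utf8_po_files() over a source tree would list any files that a UTF-8-only rule would have to convert.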

regards, tom lane