Re: [PATCH] Fix severe performance regression with gettext 0.20+ on Windows

From: Bryan Green <dbryan(dot)green(at)gmail(dot)com>
To: Peter Eisentraut <peter(at)eisentraut(dot)org>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: [PATCH] Fix severe performance regression with gettext 0.20+ on Windows
Date: 2026-02-16 17:10:21
Message-ID: bd8fd94e-c420-465b-97b3-2bb272c2136d@gmail.com
Lists: pgsql-hackers

On 2/16/2026 9:34 AM, Peter Eisentraut wrote:
> On 04.02.26 16:08, Bryan Green wrote:
>> On 1/20/2026 2:39 PM, Peter Eisentraut wrote:
>>> On 08.01.26 15:57, Bryan Green wrote:
>>>> I agree with the above changes and have implemented them, including the
>>>> correction to the cutoff version.  But, before sharing the patch with
>>>> those changes I think we should discuss 1) should we short-circuit
>>>> C/POSIX and not ever call gettext in that case,
>>>
>>> You had written that you had submitted a patch to gettext to handle that
>>> there.  Has that gotten anywhere?
>>>
>>>> 2) should we try to
>>>> convert "ISO" to Windows legacy format.
>>>
>>> I don't know.  We can just tell users to set their locale in the right
>>> format.
>> Peter,
>> I have attached the patch with the changes you suggested/requested.  The
>> patch was added to gnulib in December.  The latest release of gnu
>> gettext (1.0) does include the patch.  Yes, they jumped from 0.26 to 1.0.
>
> The newly released gettext 1.0 is now available in MSYS2, so I tested
> this again.  The new gettext indeed makes a significant performance
> improvement compared to my test results with earlier gettext versions
> (about 10x faster).
>
> This is all without any PostgreSQL patch.
>
> But when I apply your patch, it actually makes things worse (by about
> 25%).  This is incomprehensible to me, but it's very reproducible.  I'm
> not sure how to proceed now.
>

Peter,

TL;DR: We don't need the patch for gettext < 0.20 or >= 1.0. The negative
caching provides a fast failure path. The more correct Windows path
adds some cost. I think both are faster than the current extreme slowness.

Note: I would like to point out that we currently do not respect the
language preferences on Windows because we set LC_MESSAGES. In gettext
the MUI flag is disabled by default. If we wanted to respect the
language preference of the user we would have to enable the flag and not
set LC_MESSAGES. For the future...

I dug into the gettext 1.0 source code to understand why the patch
actually makes things worse, and I think I can explain what's happening.

It comes down to two different code paths in gnulib's
localename-unsafe.c for resolving locale names on native Windows, and
they have very different performance characteristics even with the cache
fix in place.

Without the patch, PostgreSQL sets LC_MESSAGES to a POSIX name like
"en_US" via IsoLocaleName(). When gettext calls
setlocale(LC_MESSAGES, NULL), it gets back "en_US" and passes it to
get_lcid(). The enum_locales_fn callback only compares against
Windows-format names (built via GetLocaleInfo(LOCALE_SENGLANGUAGE) +
"_" + GetLocaleInfo(LOCALE_SENGCOUNTRY)), so "en_US" never matches.
The lookup fails, returns LCID 0, and with the new LRU cache in gettext
1.0 that failure is cached on the first call. Subsequent calls hit the
cache immediately. The function then falls through to
gl_locale_name_environ(), which just reads the LC_MESSAGES environment
variable and returns the "en_US" string directly. That becomes the
single_locale used to find the .mo file.
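
To illustrate why the cached failure is cheap, here is a minimal portable
sketch of negative caching. It is not the gnulib code: slow_lookup(),
cached_lookup(), the locale table, and the single-entry cache are all
hypothetical stand-ins (the real get_lcid() uses an LRU cache and enumerates
the system locales through the Windows API).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the Windows-format names that the
   enumeration callback compares against. */
static const char *const system_locales[] = {
    "English_United States",
    "German_Germany",
    "French_France",
};

/* Counts how often the slow enumeration path actually runs. */
static unsigned enumerations;

/* Slow path: stand-in for a full locale enumeration.
   Returns a fake 1-based LCID, or 0 when nothing matches. */
static unsigned slow_lookup(const char *name)
{
    size_t i;

    enumerations++;
    for (i = 0; i < sizeof system_locales / sizeof system_locales[0]; i++)
        if (strcmp(name, system_locales[i]) == 0)
            return (unsigned) i + 1;
    return 0;
}

/* Single-entry cache. The stored result may be 0, so a failed
   lookup is remembered exactly like a successful one. */
static char cached_name[64];
static unsigned cached_lcid;
static int cache_valid;

static unsigned cached_lookup(const char *name)
{
    assert(name != NULL);
    if (cache_valid && strcmp(name, cached_name) == 0)
        return cached_lcid;     /* hit, possibly a cached failure */

    cached_lcid = slow_lookup(name);
    snprintf(cached_name, sizeof cached_name, "%s", name);
    cache_valid = 1;
    return cached_lcid;
}
```

In this sketch, calling cached_lookup("en_US") twice runs the enumeration
only once; the second call returns the remembered 0 without touching the
table.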

With the patch, PostgreSQL passes "English_United States.1252" through.
Now get_lcid() succeeds -- it enumerates all ~259 system locales,
finds a match, and returns a valid LCID (also cached after first call).
But then it has to call gl_locale_name_from_win32_LCID() →
gl_locale_name_from_win32_LANGID() to convert back to POSIX format.
That function calls GetACP() and getenv("GETTEXT_MUI") on every
invocation -- neither is cached -- plus does a large nested switch on
the LANGID. The end result is the same "en_US" string, but obtained
through a more expensive path.
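
For readers who have not looked at that conversion, the shape of it is
roughly as follows. This is a heavily reduced sketch, not the gnulib
function: langid_to_posix_name() and resolves_to() are hypothetical names,
only two languages are covered, and the fallback return values are my
assumption. The bit layout matches the Windows PRIMARYLANGID()/SUBLANGID()
macros, redefined locally so it builds without <windows.h>.

```c
#include <assert.h>
#include <string.h>

/* Same bit layout as the Windows PRIMARYLANGID()/SUBLANGID() macros. */
#define PRIMARY_LANG(lgid) ((lgid) & 0x3ffu)
#define SUB_LANG(lgid)     ((lgid) >> 10)

/* Numeric values of LANG_GERMAN and LANG_ENGLISH. */
enum { L_GERMAN = 0x07, L_ENGLISH = 0x09 };
/* SUBLANG_ENGLISH_US and SUBLANG_GERMAN are both 0x01;
   SUBLANG_ENGLISH_UK is 0x02. */
enum { SUB_DEFAULT = 0x01, SUB_ENGLISH_UK = 0x02 };

/* Nested switch: outer on the primary language, inner on the
   sublanguage. Fallback values here are assumptions of the sketch. */
static const char *langid_to_posix_name(unsigned lgid)
{
    switch (PRIMARY_LANG(lgid)) {
    case L_ENGLISH:
        switch (SUB_LANG(lgid)) {
        case SUB_DEFAULT:    return "en_US";
        case SUB_ENGLISH_UK: return "en_GB";
        }
        return "en";
    case L_GERMAN:
        switch (SUB_LANG(lgid)) {
        case SUB_DEFAULT: return "de_DE";
        }
        return "de";
    }
    return "C";
}

/* True when the LANGID converts to the given POSIX-style name. */
static int resolves_to(unsigned lgid, const char *posix_name)
{
    assert(posix_name != NULL);
    return strcmp(langid_to_posix_name(lgid), posix_name) == 0;
}
```

For example, LANGID 0x0409 (English, United States) comes back as "en_US",
which is the same string the unpatched code hands to gettext directly.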

Both paths produce the exact same single_locale value and the same
.mo file lookup. When there's no .mo file, _nl_load_domain() caches
the failure (decided = 1, data = NULL), so subsequent calls don't
touch the filesystem. The only difference is the per-call overhead in
guess_category_value() → gl_locale_name_posix(), which runs on
every gettext() call. The POSIX path resolves cheaply through a
cached get_lcid() failure plus a simple getenv(), while the Windows
path goes through a cached get_lcid() success followed by uncached
GetACP(), getenv("GETTEXT_MUI"), and the LANGID switch table.
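
The "decided = 1, data = NULL" behaviour mentioned above can be sketched
like this. It is a simplified model, not the gettext source: struct
domain_file, load_domain(), get_catalog(), and the open_attempts counter
are hypothetical, and a real catalog would be parsed rather than merely
opened.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, reduced model of a per-catalog record. */
struct domain_file {
    const char *filename;
    int decided;        /* nonzero once a load has been attempted */
    const void *data;   /* NULL when no catalog could be loaded */
};

/* Counts how often the filesystem is actually touched. */
static unsigned open_attempts;

static void load_domain(struct domain_file *f)
{
    FILE *fp;

    open_attempts++;
    f->decided = 1;     /* recorded up front, so a failure is remembered */
    f->data = NULL;

    fp = fopen(f->filename, "rb");
    if (fp != NULL) {
        fclose(fp);
        f->data = f->filename;  /* stand-in for parsed catalog data */
    }
}

/* Per-call entry point: only the first call reaches the filesystem. */
static const void *get_catalog(struct domain_file *f)
{
    assert(f != NULL);
    if (!f->decided)
        load_domain(f);
    return f->data;
}
```

With a missing .mo file, every call after the first returns NULL
immediately, which is why the remaining per-call difference between the
two paths is confined to the locale-name resolution.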

That explains the ~25% regression: the "correct" Windows locale path is
paradoxically slower than the "failing" POSIX path, because the failure
is cheap and cached while the success triggers additional uncached work
on every call.

With gettext 1.0's cache fix, I don't think we need the patch at all.
The POSIX names produce the right .mo file paths and perform better.
The original caching bug that motivated the patch -- where failed
lookups caused repeated EnumSystemLocales() calls -- is fixed by the
LRU cache that gettext 1.0 adds to get_lcid().

--
Bryan Green
EDB: https://www.enterprisedb.com
