Re: Slow performance of collate "en_US.utf8"

From: Joe Conway <mail(at)joeconway(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, Alexey Borschev <a(dot)borschev(at)postgrespro(dot)ru>, pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Slow performance of collate "en_US.utf8"
Date: 2025-03-01 00:11:13
Message-ID: 4ae34c31-b413-4b7e-91c3-63b9ae5da3c3@joeconway.com
Lists: pgsql-performance

On 2/28/25 17:49, Thomas Munro wrote:
> On Sat, Mar 1, 2025 at 9:03 AM Joe Conway <mail(at)joeconway(dot)com> wrote:
>> On 2/28/25 09:16, Laurenz Albe wrote:
>> > On Thu, 2025-02-27 at 16:54 +0300, Alexey Borschev wrote:
>> >> I see poor performance of text sorting of collate "en_US.utf8" in PG 17.4.
>> >
>> > I'd say that you would have to complain to the authors of the
>> > GNU C library, which provides this collation.
>>
>> Yep -- glibc starting with version 2.21 has a massive performance
>> regression for certain cases and the glibc folks have basically said
>> they will not fix it. If you try the same thing on RHEL 7.x with glibc
>> 2.17 it will perform about the same as ICU.
>
> I've idly wondered if this is the culprit, do you know?
>
> https://github.com/bminor/glibc/commit/0742aef6e52a935f9ccd69594831b56d807feef3

Yes, that was definitely the one that caused the regression. Note that
if you look closely you will find that certain distros carry a revert of
that patch in their glibc, but RHEL and RHEL-alikes do not.

Someone else pointed out this thread to me:
https://sourceware.org/bugzilla/show_bug.cgi?id=18441

Note the last message on that thread:
8<--------------
Carlos O'Donell 2019-05-09 20:44:56 UTC

(In reply to vectoroc from comment #13)
> Hello. Is there any chance that the issues will be fixed? Unfortunately
> PostgreSQL Is unable to use ICU some base features (e.g in analyze
> operation).

We haven't had anyone working on strcoll_l performance improvements. So
it's unlikely that this will get merged or reviewed any time soon.
8<--------------
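
For anyone who wants a rough feel for the strcoll cost outside the
database, here is a minimal sketch (not the test from the original report)
that times a plain code point sort against a sort driven by the C
library's strcoll(). The function names, the random 20-character input,
and the "en_US.UTF-8" locale name are assumptions made for illustration.
If that locale is not installed, the script falls back to "C". Python's
comparison-callback overhead inflates the strcoll figure, so treat the
numbers as indicative only.

```python
import locale
import random
import string
import time
from functools import cmp_to_key


def time_sort(words, key=None):
    """Return (sorted_list, elapsed_seconds) for a single sort of `words`."""
    start = time.perf_counter()
    result = sorted(words, key=key)
    return result, time.perf_counter() - start


def compare_collations(n=20000, locale_name="en_US.UTF-8", seed=0):
    """Time a code point sort against a strcoll()-based sort.

    `locale_name` is an assumption; available names vary by system.
    """
    rng = random.Random(seed)
    words = ["".join(rng.choices(string.ascii_letters, k=20)) for _ in range(n)]

    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
        active = locale_name
    except locale.Error:
        # Requested locale is not installed: fall back to "C" so the
        # script still runs (the comparison is then less interesting).
        locale.setlocale(locale.LC_COLLATE, "C")
        active = "C"

    # Default str ordering in Python is by code point, no libc involved.
    _, codepoint_s = time_sort(words)
    # cmp_to_key(locale.strcoll) calls the C library's strcoll() per comparison.
    _, strcoll_s = time_sort(words, key=cmp_to_key(locale.strcoll))
    return {"locale": active, "codepoint_s": codepoint_s, "strcoll_s": strcoll_s}


if __name__ == "__main__":
    print(compare_collations())
```

Running it on hosts with different glibc versions (or distros that carry
the revert) is one way to compare, keeping in mind it does not exercise
PostgreSQL's own sort code.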

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
