Re: Improve the performance of Unicode Normalization Forms.

From: John Naylor <johncnaylorls(at)gmail(dot)com>
To: Alexander Borisov <lex(dot)borisov(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Improve the performance of Unicode Normalization Forms.
Date: 2025-06-11 07:13:32
Message-ID: CANWCAZZ4Vzn042SFsy-VyD5D_kNhyutzFd3KrJhkfWg9yNZVXA@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jun 3, 2025 at 1:51 PM Alexander Borisov <lex(dot)borisov(at)gmail(dot)com> wrote:
> 5. The server part "lost weight" in the binary, but the frontend
> "gained weight" a little.
>
> I read the old commits, which say that the size of the frontend is very
> important and that speed is not important
> (speed is important on the server).
> I'm not quite sure what to do if this is really the case. Perhaps
> we should leave the slow version for the frontend.

In the "small" patch, the frontend files got a few kB bigger, but the
backend got quite a bit smaller. If we decided to go with this patch,
I'd say it's preferable to do it in a way that keeps both paths the
same.

> How was it tested?
> Four files were created for each normalization form: NFC, NFD, NFKC,
> and NFKD.
> The files were sent via pgbench. The files contain all code points that
> need to be normalized.
> Unfortunately, the patches are already quite large, but if necessary,
> I can send these files in a separate email or upload them somewhere.

What kind of workload do they present?
Did you consider running the same tests from the thread that led to
the current implementation?

--
John Naylor
Amazon Web Services
