Re: Improve the performance of Unicode Normalization Forms.

From: Alexander Borisov <lex(dot)borisov(at)gmail(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>, Victor Yegorov <vyegorov(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Improve the performance of Unicode Normalization Forms.
Date: 2025-09-29 10:22:29
Message-ID: a2ba2920-cdd6-43e5-a9ac-079adcb2ff78@gmail.com
Lists: pgsql-hackers

20.09.2025 04:03, Jeff Davis wrote:
> On Thu, 2025-09-11 at 20:51 +0300, Alexander Borisov wrote:
>>
>>> Hey.
>>>
>>> I've looked into these patches.
>>
>> Hi Victor,
>>
>> Thank you for reviewing the patch and testing it!
>
> Heikki, do you have thoughts on this thread?

Hey,

In patch v5 (attached), I changed the approach to class caching.
Now, for small texts (fewer than 512 characters), we don't allocate
memory from the heap; we use a buffer on the stack instead.
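
To illustrate the stack-versus-heap idea described above, here is a minimal
sketch. It is not the code from the patch: the names STACK_CACHE_LEN,
toy_class() and count_nonzero_classes() are hypothetical, the class lookup
is a toy rule rather than real Unicode data, and only the 512 threshold
comes from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Threshold from the description above; the name is hypothetical. */
#define STACK_CACHE_LEN 512

/*
 * Hypothetical stand-in for a per-code-point class lookup.
 * Toy rule only: treat U+0300..U+036F as class 230, everything else as 0.
 */
static uint8_t
toy_class(uint32_t cp)
{
	return (cp >= 0x0300 && cp <= 0x036F) ? 230 : 0;
}

/*
 * Count code points with a non-zero class, caching one class per input
 * code point.  Inputs shorter than STACK_CACHE_LEN use a stack buffer;
 * longer inputs fall back to malloc().  *used_heap reports which path
 * was taken.  Returns -1 on allocation failure.
 */
static int
count_nonzero_classes(const uint32_t *input, size_t len, int *used_heap)
{
	uint8_t		stack_cache[STACK_CACHE_LEN];
	uint8_t	   *cache = stack_cache;
	int			count = 0;

	assert(input != NULL || len == 0);

	*used_heap = 0;
	if (len >= STACK_CACHE_LEN)
	{
		/* Too large for the stack buffer: allocate from the heap. */
		cache = malloc(len);
		if (cache == NULL)
			return -1;
		*used_heap = 1;
	}

	/* First pass: fill the class cache. */
	for (size_t i = 0; i < len; i++)
		cache[i] = toy_class(input[i]);

	/* Second pass: read classes from the cache instead of looking up again. */
	for (size_t i = 0; i < len; i++)
	{
		if (cache[i] != 0)
			count++;
	}

	/* Only the heap path needs cleanup. */
	if (cache != stack_cache)
		free(cache);

	return count;
}
```

In this sketch the short-input path does no allocation and no free at all,
which is where the saving for small texts would come from.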

The pgbench tests show a 2x speedup.
Jeff's tests show a 10x speedup.
The tests from the thread
https://www.postgresql.org/message-id/CAFBsxsHUuMFCt6-pU+oG-F1==CmEp8wR+O+bRouXWu6i8kXuqA@mail.gmail.com
also show a 2x speedup.

--
Regards,
Alexander Borisov

Attachment                                                        Content-Type  Size
v5-0001-Moving-Perl-functions-Sparse-Array-to-a-common-mo.patch   text/plain    12.6 KB
v5-0002-Improve-the-performance-of-Unicode-Normalization-.patch   text/plain    1.0 MB
v5-0003-Refactoring-Unicode-Normalization-Forms-performan.patch   text/plain    1.3 MB
