Re: Unicode update and some tooling improvements

From: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>
To: Peter Eisentraut <peter(at)eisentraut(dot)org>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Unicode update and some tooling improvements
Date: 2026-02-27 02:50:13
Message-ID: 906DA1A8-73FD-4BE5-AD82-80C871602BAE@gmail.com
Lists: pgsql-hackers

> On Feb 27, 2026, at 04:36, Peter Eisentraut <peter(at)eisentraut(dot)org> wrote:
>
> This is the annual update of the Unicode data. I also worked a bit on the tooling. The update-unicode target under meson did not update the data in contrib/unaccent/, so I added that. I also fixed a Python deprecation warning in the generation script and made some light changes in the surrounding documentation.
> <0001-Fix-Python-deprecation-warning.patch><0002-doc-Fix-capitalization-of-Unicode.patch><0003-Implement-unaccent-Unicode-data-update-in-meson.patch><0004-Update-RELEASE_CHANGES.patch><0005-Update-Unicode-data-to-CLDR-48.1.patch><0006-Update-Unicode-data-to-Unicode-17.0.0.patch>

Overall looks good to me.

To verify this patch, I upgraded my local ICU to version 78.2, then ran the Python script:
```
chaol(at)ChaodeMacBook-Air postgresql % python3 contrib/unaccent/generate_unaccent_rules.py \
--unicode-data-file src/common/unicode/UnicodeData.txt \
--latin-ascii-file contrib/unaccent/Latin-ASCII.xml \
> /tmp/unaccent.rules.new
chaol(at)ChaodeMacBook-Air postgresql % diff -u contrib/unaccent/unaccent.rules /tmp/unaccent.rules.new # no difference
```
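
The regenerate-and-diff check above could also be wrapped in a small helper so that a mismatch shows up as a non-zero exit status. This is only a sketch: `check_regen` is a hypothetical name, not something in the patch or the tree, and it assumes the generator writes its result to stdout, as the script does in the command above.

```shell
# Sketch only: check_regen is a hypothetical helper, not part of PostgreSQL.
# usage: check_regen <committed-file> <generator-command> [args...]
# Returns 0 if the regenerated output matches, 1 if it differs,
# 2 if the generator itself failed.
check_regen() {
    committed=$1
    shift
    tmp=$(mktemp) || return 2
    # Run the generator; its stdout is the regenerated file.
    if ! "$@" > "$tmp"; then
        rm -f "$tmp"
        return 2
    fi
    # diff prints nothing and returns 0 when the files are identical.
    if diff -u "$committed" "$tmp"; then
        echo "no difference"
        status=0
    else
        status=1
    fi
    rm -f "$tmp"
    return $status
}
```

Run from the top of the source tree, it would be called with the committed rules file followed by the same generator command as above, for example `check_regen contrib/unaccent/unaccent.rules python3 contrib/unaccent/generate_unaccent_rules.py --unicode-data-file src/common/unicode/UnicodeData.txt --latin-ascii-file contrib/unaccent/Latin-ASCII.xml`.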

I also ran a clean meson build and specifically verified the new Unicode wiring:
```
chaol(at)ChaodeMacBook-Air postgresql % ninja -C build update-unicode # passed
```

Then I confirmed that the target is registered and ran the tests:
```
chaol(at)ChaodeMacBook-Air postgresql % ninja -C build -t targets | grep update-unicode
update-unicode: phony
chaol(at)ChaodeMacBook-Air postgresql % ninja -C build test # passed
ninja: Entering directory `build'
[406/407] Running all tests

Ok: 333
Fail: 0
Skipped: 30

Full log written to /Users/chaol/Documents/code/postgresql/build/meson-logs/testlog.txt
```

Only a small comment on 0003:
```
# Meson 0.57.0 and 0.57.1 are buggy, therefore >=0.57.2.
- meson_version: '>=0.57.2',
+ # FIXME: update comment
+ meson_version: '>=0.58',
```

Why leave a FIXME instead of just updating the comment? I see that the installation.sgml doc has already been updated.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
