From: | "Daniel Verite" <daniel(at)manitou-mail(dot)org> |
---|---|
To: | "Dominique Devienne" <ddevienne(at)gmail(dot)com> |
Cc: | "Laurenz Albe" <laurenz(dot)albe(at)cybertec(dot)at>,pgsql-general(at)postgresql(dot)org |
Subject: | Re: LOCALE C.UTF-8 on EDB Windows v17 server |
Date: | 2025-06-05 20:57:24 |
Message-ID: | aded3b3d-4849-4bf4-a072-eecf84f5cd4e@manitou-mail.org |
Lists: | pgsql-general |
Dominique Devienne wrote:
> So you're saying datcollate and datctype from pg_database are
> irrelevant to PostgreSQL itself, and only extensions might be affected?
Almost. An exception that still exists in v18, as far as I can see [1],
is the default full text search parser still using libc functions like
iswdigit(), iswpunct(), iswspace()... that depend on LC_CTYPE.
So you could see differences between OSes in tsvector contents
in a database with the builtin provider.
That is, unless LC_CTYPE=C is used. But then the parsing is suboptimal,
since the parser does not recognize non-ASCII Unicode punctuation or
space characters as such.
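To illustrate why the result follows LC_CTYPE, here is a minimal C sketch
that classifies a single wide character with the same libc functions under
a chosen locale. The helper names ctype_flags() and ctype_flags_in() are
made up for this example and do not come from wparser_def.c; the locale
names you pass in are assumptions too, since what is available differs
between OSes.

```c
#include <locale.h>
#include <wchar.h>
#include <wctype.h>

/* Bit flags for the three character classes mentioned above. */
enum { CT_DIGIT = 1, CT_PUNCT = 2, CT_SPACE = 4 };

/* Classify one wide character under the LC_CTYPE currently in effect. */
int ctype_flags(wint_t c)
{
    int flags = 0;

    if (iswdigit(c))
        flags |= CT_DIGIT;
    if (iswpunct(c))
        flags |= CT_PUNCT;
    if (iswspace(c))
        flags |= CT_SPACE;
    return flags;
}

/*
 * Switch LC_CTYPE to locale_name, then classify c.
 * Returns -1 if the OS does not provide that locale.
 */
int ctype_flags_in(const char *locale_name, wint_t c)
{
    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -1;
    return ctype_flags(c);
}
```

Calling ctype_flags_in() on a non-ASCII code point such as U+2026 or
U+00A0, first with "C" and then with a UTF-8 locale that exists on your
system, is one way to see whether a given libc treats it as punctuation or
space. The answers can differ between OSes.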
Personally, I would still take care to set LC_CTYPE to a reasonable UTF-8
locale with v17 or v18.
[1]
https://doxygen.postgresql.org/wparser__def_8c.html#a420ea398a8a11db92412a2af7bf45e40
Best regards,
--
Daniel Vérité
https://postgresql.verite.pro/