From: | Cees van Zeeland <cees(dot)van(dot)zeeland(at)freedom(dot)nl> |
---|---|
To: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
Cc: | Michael Paquier <michael(at)paquier(dot)xyz>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #18362: unaccent rules and Old Greek text |
Date: | 2024-03-01 15:54:07 |
Message-ID: | 63c65b3a-d142-409d-92ec-2a7d1df6f697@freedom.nl |
Lists: | pgsql-bugs |
Hi Thomas,
I found this file:
https://www.unicode.org/Public/15.1.0/ucd/CompositionExclusions.txt
It might help us identify the characters we are looking for.
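A minimal sketch of why the two versions of a character compare unequal until they are normalized. It uses only Python's standard `unicodedata` module, so the results reflect whatever Unicode version that Python build ships, not necessarily 15.1. The helper name `same_after` is hypothetical, and the assumption here is that the oxia code points are singleton decompositions, which normalization never composes back.

```python
import unicodedata

oxia = "\u1f71"   # GREEK SMALL LETTER ALPHA WITH OXIA (Greek Extended block)
tonos = "\u03ac"  # GREEK SMALL LETTER ALPHA WITH TONOS (Greek and Coptic block)


def same_after(form, a, b):
    """Hypothetical helper: compare two strings after normalizing both to `form`."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


if __name__ == "__main__":
    # The raw code points differ, so a plain string comparison fails.
    print("raw equal:", oxia == tonos)
    # After NFC or NFD both spellings become the same sequence.
    print("NFC equal:", same_after("NFC", oxia, tonos))
    print("NFD equal:", same_after("NFD", oxia, tonos))
```

If this holds for the whole series, text normalized to NFC before lookup would never contain the oxia code points in the first place, which may or may not be an acceptable assumption for unaccent.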
Hope this helps.
Cees
On 01/03/2024 02:53, Thomas Munro wrote:
> On Tue, Feb 27, 2024 at 1:33 AM Cees van Zeeland
> <cees(dot)van(dot)zeeland(at)freedom(dot)nl> wrote:
>> I'm not an expert, but computers evidently distinguish between the two versions of the characters.
>> We are talking about this series:
>> U+1F70 - U+1F7D: ὰ ά ὲ έ ὴ ή ὶ ί ὸ ό ὺ ύ ὼ ώ
>> Is it possible to limit the redirection in the script to this range in some way?
> Right, so to get this in we either need to decide that we're OK with
> adding that many characters, or figure out some systematic way to
> select just the ones we want. One hint that might be helpful if
> someone wants to investigate: I suspect that a lot of those mappings
> might be marked with <font>, which seems to be for code points for
> alternative renderings ("mathematical" bold, italic, fraktur etc), so
> perhaps we could filter them out that way without losing the
> oxia-marked characters if that's the way it has to be.
>
> I think all the relevant parts of the character database file are described here:
>
> https://unicode.org/reports/tr44/#Property_Values
>
> The file we're currently using is 15.1:
>
> https://www.unicode.org/Public/15.1.0/ucd/UnicodeData.txt
>
> I registered this thread as https://commitfest.postgresql.org/47/4873/ .
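The filtering idea in the quoted message can be sketched as follows. This is not the unaccent rule generator itself; it is a rough illustration that reads decomposition mappings through Python's standard `unicodedata` module (whose data may be older than 15.1) instead of parsing UnicodeData.txt directly. The function names `split_decomposition` and `base_letter` are hypothetical. Mappings carrying a compatibility tag such as `<font>` are left alone, while canonical mappings, including the single-code-point ones used by the oxia characters, are followed down to the base letter.

```python
import unicodedata


def split_decomposition(cp):
    """Return (tag, [code points]) for the decomposition mapping of `cp`.

    `tag` is None for a canonical mapping and e.g. 'font' for '<font> 0041'.
    A code point without any mapping yields (None, []).
    """
    raw = unicodedata.decomposition(chr(cp))
    if not raw:
        return None, []
    parts = raw.split()
    tag = None
    if parts[0].startswith("<"):
        tag = parts[0].strip("<>")
        parts = parts[1:]
    return tag, [int(p, 16) for p in parts]


def base_letter(cp):
    """Follow canonical mappings only, always taking the first code point.

    Tagged (compatibility) mappings such as <font> are not followed, so
    the code point is returned unchanged in that case.
    """
    tag, mapping = split_decomposition(cp)
    if tag is not None or not mapping:
        return cp
    return base_letter(mapping[0])


if __name__ == "__main__":
    # The series discussed in this thread: U+1F70 .. U+1F7D.
    for cp in range(0x1F70, 0x1F7E):
        tag, mapping = split_decomposition(cp)
        target = " ".join(f"{m:04X}" for m in mapping)
        print(f"U+{cp:04X} {chr(cp)} tag={tag} mapping={target} "
              f"base={chr(base_letter(cp))}")
```

Skipping on the tag rather than on a hard-coded range keeps the selection systematic, at the cost of possibly admitting other canonical singletons that would still need to be reviewed.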