tsearch with Turkish locale ( was Re: foreign_data test fails with non-C locale)

From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Devrim GÜNDÜZ <devrim(at)gunduz(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: tsearch with Turkish locale ( was Re: foreign_data test fails with non-C locale)
Date: 2009-01-19 15:03:33
Message-ID: 49749645.5070801@gmx.net
Lists: pgsql-hackers

Devrim GÜNDÜZ wrote:
> Yep, I ran them already, and as you wrote, I'm getting 3 errors (tsearch
> tests + foreign_data test).
>
>> And then use your language skills to determine what the correct
>> behavior is. ;-)
>
> SKIES would be skıes (dotless i).
>
> Here is the conversion table:
>
> I (capital) <-> ı
> İ (capital) <-> i

I think the tests show that there is a bug in the tsearch support for
Turkish. Here is the test diff:

--- expected/tsearch.out 2008-10-18 12:56:29.000000000 +0300
+++ results/tsearch.out 2009-01-19 16:26:51.000000000 +0200
@@ -962,38 +962,38 @@
SELECT to_tsvector('SKIES My booKs');
to_tsvector
----------------------------
- 'books':3 'my':2 'skies':1
+ 'books':3 'my':2 'skIes':1
(1 row)
[and more of the same]

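This is not correct under either Turkish or non-Turkish language rules:
Turkish rules would give 'skıes' (dotless ı), non-Turkish rules would give
'skies', but here the capital I has been left as it was.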

Note that

postgres=# select lower('SKIES');
lower
-------
skıes
(1 row)
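
To make the comparison concrete, here is a minimal Python sketch of the two
casing rules applied to the test word. It is only an illustration, not how
tsearch or lower() are implemented; the helper names turkish_lower and
plain_lower are made up for this example, and the mapping is just the
conversion table quoted above.

```python
# Hypothetical helpers for illustration only; not PostgreSQL code.
# Turkish casing: capital I lowercases to dotless ı, capital İ to dotted i.
TURKISH_LOWER = str.maketrans({"I": "ı", "İ": "i"})


def turkish_lower(text: str) -> str:
    """Lowercase text under Turkish rules (I -> ı, İ -> i)."""
    # Map the two special capitals first, then lowercase everything else.
    return text.translate(TURKISH_LOWER).lower()


def plain_lower(text: str) -> str:
    """Lowercase text under non-Turkish rules (I -> i)."""
    return text.lower()


word = "SKIES"
observed = "skIes"  # lexeme reported in results/tsearch.out

turkish = turkish_lower(word)
non_turkish = plain_lower(word)

print("Turkish rules:     ", turkish)
print("non-Turkish rules: ", non_turkish)
print("tsearch output matches either:", observed in (turkish, non_turkish))
```

The observed lexeme matches neither result, which is the sense in which the
tsearch output is wrong under both sets of rules, while lower() agrees with
the Turkish result.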
