Re: Full text search in Chinese

From: Lincoln Yeoh <lyeoh(at)pop(dot)jaring(dot)my>
To: Mike Chamberlain <mikeachamberlain(at)gmail(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: Full text search in Chinese
Date: 2010-10-26 18:05:47
Message-ID: 20101026180649.B99F41336DB6@mail.postgresql.org
Lists: pgsql-general

At 11:42 AM 10/25/2010, Mike Chamberlain wrote:
>Has anyone implemented FTS in Chinese on PG? I
>guess I need a Chinese ispell dictionary and
>parser, neither of which I can find after a lot of googling.
>
>I have a bounty on this question on Stackoverflow if anyone wants to claim it:
>
>http://stackoverflow.com/questions/3994504/how-do-i-implement-full-text-search-in-chinese-on-postgresql
>
>Thanks,
>
>Mike

What sort of usage are you expecting, e.g. what kind of search terms?

Written Chinese is a character-based language,
not an alphabet-style language. To complicate
things a bit, there are two main character sets:
Traditional Chinese and Simplified Chinese.

A Chinese character is roughly the equivalent of an
English keyword, but lots of "words"/"meanings"
require two or more characters. You might
be able to handle this similarly to the way English
phrases are handled (indexed and searched for);
after all, "bee's knees" usually means a different
thing from the actual bee's knees.
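
One rough way to picture the phrase-style idea is to index overlapping
two-character tokens (bigrams), so that multi-character words can be
matched without a dictionary. The sketch below is plain Python, not
anything built into PostgreSQL. The function names cjk_bigrams and
matches are made up for illustration, and bigram containment is only an
approximation of a real phrase match.

```python
def cjk_bigrams(text):
    """Split text into overlapping two-character tokens, ignoring whitespace.

    Hypothetical helper for illustration; not a PostgreSQL parser.
    """
    chars = [c for c in text if not c.isspace()]
    if len(chars) < 2:
        # A single character (or nothing) cannot form a bigram.
        return ["".join(chars)] if chars else []
    return [chars[i] + chars[i + 1] for i in range(len(chars) - 1)]


def matches(query, document):
    """Return True if every bigram of the query occurs in the document.

    This is an approximation: it checks that the bigrams are present,
    not that they appear consecutively in the same order.
    """
    q = cjk_bigrams(query)
    if len(q) == 1 and len(q[0]) == 1:
        # Single-character query: fall back to a plain substring test.
        return q[0] in document
    doc = set(cjk_bigrams(document))
    return bool(q) and all(tok in doc for tok in q)


if __name__ == "__main__":
    print(cjk_bigrams("中文搜索"))
    print(matches("搜索", "中文搜索"))
```

Whether something like this is good enough depends on the search terms
you expect; a proper word segmenter would give fewer false matches.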

Japanese, on the other hand, has _three_ main
scripts: two "alphabet style" and one "Chinese character style"...

Regards,

Link.
