From: | Gregory Stark <stark(at)enterprisedb(dot)com> |
---|---|
To: | "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>, "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Oleg Bartunov" <oleg(at)sai(dot)msu(dot)su>, "Teodor Sigaev" <teodor(at)sigaev(dot)ru>, <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Latin vs non-Latin words in text search parsing |
Date: | 2007-10-23 15:19:19 |
Message-ID: | 87k5pdq2o8.fsf@oxford.xeocode.com |
Lists: | pgsql-hackers |
"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
> I wrote:
>> Maybe "aword", "word", and "numword"?
>
> Does the lack of response mean people are satisfied with that?
Sorry, I had a couple of responses partially written but never finished them.
If we were doing it from scratch I would suggest using longer names. At the
least I would still suggest using "ascii" or "asciiword" instead of "aword".
> Fleshing the proposal out to include the hyphenated-word categories:
>
> aword     All ASCII letters
> word      All letters according to iswalpha()
> numword   Mixed letters and digits (all iswalnum())
This does bring up another idea: using the ctype names. They could be named
asciiword, alphaword, and alnumword. Frankly, I don't think this is any nicer
than numword anyway.
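To make the three categories concrete, here is a rough sketch in C of how a token might be sorted into them using the ctype-style predicates mentioned above. This is not the actual parser code: the enum, `is_ascii_letter()` and `classify_token()` are hypothetical names made up for illustration, and I'm assuming that all-digit tokens and the hyphenated categories are handled elsewhere.

```c
#include <wchar.h>
#include <wctype.h>

/* Hypothetical category names, following the proposal in this thread. */
typedef enum {
    TOK_NONE,       /* empty, all digits, or contains a non-alphanumeric char */
    TOK_ASCIIWORD,  /* "aword" / "asciiword": all ASCII letters */
    TOK_WORD,       /* "word": all letters according to iswalpha() */
    TOK_NUMWORD     /* "numword": mixed letters and digits (all iswalnum()) */
} TokenClass;

/* Locale-independent test for the ASCII letters a-z and A-Z. */
static int is_ascii_letter(wchar_t c)
{
    return (c >= L'a' && c <= L'z') || (c >= L'A' && c <= L'Z');
}

/* Classify a whole token; the narrowest matching category wins. */
TokenClass classify_token(const wchar_t *s)
{
    int all_ascii = 1;   /* every character is an ASCII letter */
    int all_alpha = 1;   /* every character satisfies iswalpha() */
    int has_alpha = 0;   /* at least one letter was seen */

    if (*s == L'\0')
        return TOK_NONE;

    for (; *s != L'\0'; s++) {
        if (!iswalnum((wint_t) *s))
            return TOK_NONE;          /* punctuation, hyphen, space, ... */
        if (iswalpha((wint_t) *s))
            has_alpha = 1;
        else
            all_alpha = 0;            /* a digit */
        if (!is_ascii_letter(*s))
            all_ascii = 0;
    }

    if (all_ascii)
        return TOK_ASCIIWORD;
    if (all_alpha)
        return TOK_WORD;
    if (has_alpha)
        return TOK_NUMWORD;
    return TOK_NONE;                  /* digits only: assumed handled elsewhere */
}
```

Note that what counts as a letter for iswalpha() depends on the current locale, so the boundary between "word" and TOK_NONE in this sketch moves with setlocale(); only the ASCII category is locale-independent.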
> I'm not totally thrilled with these short names for the hyphenation
> categories, but they will seem at least somewhat familiar to users
> of contrib/tsearch2, and it's probably not worth changing them just
> to make them look prettier.
I tried thinking of better words for this and couldn't think of any. The only
other word for a hyphenated word I could think of is probably "compound" and
the word for parts of a compound word is "lexeme", but that's certainly not
going to be clearer (and technically it's not quite right anyway).
So in short I would still suggest using "ascii" instead of just "a" but
otherwise I think your suggestion is best.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com