Re: [Fwd: Re: tsearch in core patch]

From: "Mike Rylander" <mrylander(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Tatsuo Ishii" <ishii(at)sraoss(dot)co(dot)jp>, euler(at)timbira(dot)com, teodor(at)sigaev(dot)ru, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [Fwd: Re: tsearch in core patch]
Date: 2007-06-25 13:22:33
Message-ID: b918cf3d0706250622n6b4df67avf2a9ca9c4f6e8f48@mail.gmail.com
Lists: pgsql-hackers

On 6/25/07, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Well, it's not hard at all to find chunks of English text that have
> embedded bits of French, Spanish, or what-have-you, but that's not an
> argument for trying to intermix the stemmers. I doubt that such simple
> bits of program could tell the language difference well enough to
> determine which stemming rules to apply.
>

While I imagine that is probably true of many, if not most, cases, my
project in particular would greatly benefit from the ability to mix
stemmers. I work with complex bibliographic data, which has language
information embedded within records. This is not limited to the record
level either. Individual fields within each bibliographic record can be
in different languages.

Especially in countries such as Canada (en_CA/fr_CA), where multilingual
software is a requirement for use in public institutions, the ability to
choose a stemmer and stop-word list at will for any particular record
would provide exactly the behavior needed. The obvious generalization
from Canada would be to support any mix of languages supported by
tsearch2.

I can certainly understand the benefit of making the default
configuration a simple locale to language map, but there are
definitely uses for searching using different stemmers/stop-lists even
within the same corpus/index. So, as a datapoint for the discussion,
I would ask that the option of multiple languages per DB locale not be
removed if it can be at all avoided.
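
To make the request concrete, here is a minimal Python sketch of the
behavior described above: a default language derived from the database
locale, with a per-field override taken from the record itself. Everything
in it is an assumption for illustration. The names CONFIGS, LOCALE_DEFAULT,
stem and index_terms are hypothetical, and the suffix rules and stop-word
lists are toy stand-ins, not the tsearch2 dictionaries or the Snowball
stemmers.

```python
# Toy illustration only: these suffix rules and stop-word lists are made up
# and are far cruder than the real tsearch2/Snowball dictionaries.
CONFIGS = {
    "en": {"stop": {"the", "of", "and", "a"}, "suffixes": ("ing", "es", "s")},
    "fr": {"stop": {"le", "la", "les", "de", "et"}, "suffixes": ("es", "s", "e")},
}

# Hypothetical default: a simple map from database locale to language.
LOCALE_DEFAULT = {"en_CA": "en", "fr_CA": "fr"}


def stem(word, suffixes):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in suffixes:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def index_terms(text, lang=None, db_locale="en_CA"):
    """Return index terms for text.

    Uses the field's own language when one is given, and otherwise falls
    back to the language implied by the database locale.
    """
    config = CONFIGS[lang or LOCALE_DEFAULT[db_locale]]
    words = text.lower().split()
    return [stem(w, config["suffixes"]) for w in words if w not in config["stop"]]


# One bibliographic record whose fields are in different languages.
record = [
    {"field": "title", "lang": "en", "text": "The history of printing"},
    {"field": "parallel_title", "lang": "fr", "text": "Les histoires de la presse"},
]

if __name__ == "__main__":
    for field in record:
        print(field["field"], index_terms(field["text"], lang=field["lang"]))
```

The point of the sketch is only that the language is an argument to the
indexing step rather than a property fixed by the database locale, so both
fields of the same record can land in the same index with appropriate
stemming and stop-word handling.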

Thanks for listening, and for all the great work on getting tsearch
into core! :)

--
Mike Rylander
