Re: english parser in text search: support for multiple words in the same position

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Sushant Sinha <sushant354(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Markus Wanner <markus(at)bluegap(dot)ch>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: english parser in text search: support for multiple words in the same position
Date: 2010-09-30 00:44:56
Message-ID: AANLkTimmykArc7DwhM6tyOGCQnwzYM8D2CR0tES7vtRY@mail.gmail.com
Lists: pgsql-hackers

On Wed, Sep 29, 2010 at 1:29 AM, Sushant Sinha <sushant354(at)gmail(dot)com> wrote:
> Any updates on this?
>
>
> On Tue, Sep 21, 2010 at 10:47 PM, Sushant Sinha <sushant354(at)gmail(dot)com>
> wrote:
>>
>> > I looked at this patch a bit.  I'm fairly unhappy that it seems to be
>> > inventing a brand new mechanism to do something the ts parser can
>> > already do.  Why didn't you code the url-part mechanism using the
>> > existing support for compound words?
>>
>> I am not familiar with the compound word implementation, so I am not sure
>> how to split a URL using compound word support. I looked at the
>> documentation for compound words, and it does not say much about how to
>> identify the components of a token. Is a compound word split by matching
>> against a list of words? If so, we will not be able to use it, because we
>> do not know all the words that can appear in a url/host/email/file.

It seems to me that you need to familiarize yourself with the existing
compound-word support and then post an analysis or a new patch.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
