Re: TSearch vs. Homebrew

From: Tim Allen <tim(at)proximity(dot)com(dot)au>
To: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
Cc: Hannes Dorbath <light(at)theendofthetunnel(dot)de>, pgsql-general(at)postgresql(dot)org
Subject: Re: TSearch vs. Homebrew
Date: 2006-06-28 01:42:03
Message-ID: 44A1DE6B.7050400@proximity.com.au
Lists: pgsql-general

Oleg Bartunov wrote:
>>> On Tue, 27 Jun 2006, Hannes Dorbath wrote:
>>>
>>>> http://www.symfony-project.com/askeet/21
>>>>
>>>> How does this dead simple approach compare to TSearch performance /
>>>> scaling wise?
>
> Sorry, I was a bit off-topic. Lucene scales as any inverted index based
> engine. In 8.2 tsearch2 also has inverted index support, but we obey
> relational approach and couldn't provide a whole set of optimization,
> which file based engines could provide.

If you read further down the article, you see that the fellow seems
not to be using Lucene at all, but is instead setting up his own text
indexing, i.e. identifying words, stemming them, and building a table
that records which words appear in which record. Basically he seems to
have re-implemented tsearch2 in a mixture of PHP and MySQL. I can't
imagine how well (or badly...) that must perform for a large amount of
data. The comments at the end are amusing; one fellow is quite touching
in his naivety, wondering how much effort it would be to turn the
framework as described into an open source competitor for Google.
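
For anyone who hasn't read the article, the general shape of such a
homebrew index can be sketched in a few lines. This is only an
illustration of the technique (split into words, stem, record which
words appear in which record), not the article's actual PHP/MySQL code.
The function names, the toy suffix-stripping "stemmer" and the sample
records are all made up for the example, and a dict of sets stands in
for the word table.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def crude_stem(word):
    """Toy stemmer (an assumption for this sketch): strip one common suffix.

    A real system would use a proper stemming algorithm instead.
    """
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_index(docs):
    """Map each stemmed word to the set of record ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[crude_stem(word)].add(doc_id)
    return index


def search(index, query):
    """Return ids of records containing every (stemmed) query word."""
    terms = [crude_stem(word) for word in tokenize(query)]
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample records, keyed by record id.
docs = {
    1: "Indexing text with PostgreSQL",
    2: "Searching indexed questions",
    3: "RAID and PostgreSQL",
}
index = build_index(docs)
```

Here search(index, "indexed postgresql") matches only record 1, since
both "indexing" and "indexed" reduce to the same stem. In a
database-backed version each (word, record) pair would be a row in a
table, so every extra query word means another lookup or join against
that table, which is where the scaling question comes from.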

My best guess as an answer to the original question is that this
approach would not scale very well at all, and certainly not as well as
tsearch2 (even though tsearch2 doesn't scale quite as well as one might
hope either). And for that matter, it's not all that simple - it seems
to be of a similar order of complexity to tsearch2. However, my
performance estimate is not based on any actual experience, so I could
be wrong.

Tim

--
-----------------------------------------------
Tim Allen tim(at)proximity(dot)com(dot)au
Proximity Pty Ltd http://www.proximity.com.au/
