Re: Full text search with ORDER BY performance issue

From: PFC <lists(at)peufeu(dot)com>
To: "Robert Haas" <robertmhaas(at)gmail(dot)com>, "Oleg Bartunov" <oleg(at)sai(dot)msu(dot)su>
Cc: Krade <krade(at)krade(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Full text search with ORDER BY performance issue
Date: 2009-07-29 14:18:38
Message-ID: op.uxt11cs6cigqcu@soyouz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance


> If love is an uncommon word, there's no help for queries of this type
> being slow unless the GIN index can return the results in order. But
> if love is a common word, then it would be faster to do an index scan
> by timestamp on the baserel and then treat comment_tsv @@
> plainto_tsquery('love') as a filter condition. Is this a selectivity
> estimation bug?

If you have really lots of documents to index (this seems the case)
perhaps you should consider Xapian. It is very easy to use (although, of
course, tsearch integrated in Postgres is much easier since you have
nothing to install), and it is *incredibly* fast.

In my tests (2 years ago) with many gigabytes of stuff to search into,
differences became obvious when the data set is much bigger than RAM.
- Postgres' fulltext was 10-100x faster than MySQL fulltext on searches
(lol) (and even a lot "more faster" on INSERTs...)
- and Xapian was 10-100 times faster than Postgres' fulltext.

(on a small table which fits in RAM, differences are small).

Of course Xapian is not Postgres when you talk about update
concurrency..........
(single writer => fulltext index updating background job is needed, a
simple Python script does the job)

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2009-07-29 14:22:24 Re: Full text search with ORDER BY performance issue
Previous Message Robert Haas 2009-07-29 14:02:11 Re: Full text search with ORDER BY performance issue