Re: Full text search with ORDER BY performance issue

From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Krade <krade(at)krade(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Full text search with ORDER BY performance issue
Date: 2009-07-20 12:12:20
Message-ID: Pine.LNX.4.64.0907201604000.8065@sn.sai.msu.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Krade,

On Sat, 18 Jul 2009, Krade wrote:

> Here's a couple of queries:
>
> archive=> explain analyze select * from a where comment_tsv @@
> plainto_tsquery('love') order by timestamp desc limit 24 offset 0;
>
> QUERY PLAN
> ----------
> Limit (cost=453248.73..453248.79 rows=24 width=281) (actual
> time=188441.047..188441.148 rows=24 loops=1)
> -> Sort (cost=453248.73..453882.82 rows=253635 width=281) (actual
> time=188441.043..188441.079 rows=24 loops=1)
> Sort Key: "timestamp"
> Sort Method: top-N heapsort Memory: 42kB
> -> Bitmap Heap Scan on a (cost=17782.16..446166.02 rows=253635
> width=281) (actual time=2198.930..187948.050 rows=256378 loops=1)
> Recheck Cond: (comment_tsv @@ plainto_tsquery('love'::text))
> -> Bitmap Index Scan on timestamp_comment_gin
> (cost=0.00..17718.75 rows=253635 width=0) (actual time=2113.664..2113.664
> rows=259828 loops=1)
> Index Cond: (comment_tsv @@
> plainto_tsquery('love'::text))
> Total runtime: 188442.617 ms
> (9 rows)
>
> archive=> explain analyze select * from a where comment_tsv @@
> plainto_tsquery('love') limit 24 offset 0;
>
> QUERY PLAN
> ----------
> Limit (cost=0.00..66.34 rows=24 width=281) (actual time=14.632..53.647
> rows=24 loops=1)
> -> Seq Scan on a (cost=0.00..701071.49 rows=253635 width=281) (actual
> time=14.629..53.588 rows=24 loops=1)
> Filter: (comment_tsv @@ plainto_tsquery('love'::text))
> Total runtime: 53.731 ms
> (4 rows)
>
> First one runs painfully slow.

Hmm, everything is already written in explain :) In the first query
253635 rows should be readed from disk and sorted, while in the
second query only 24 (random) rows readed from disk, so there is 4 magnitudes
difference and in the worst case you should expected time for the 1st query
about 53*10^4 ms.

>
> Is there really no way to have efficient full text search results ordered by
> a separate field? I'm really open to all possibilities, at this point.
>
> Thanks.
>
>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Marcin Stępnicki 2009-07-20 12:22:03 Re: Full text search with ORDER BY performance issue
Previous Message Matthew Wakeling 2009-07-20 10:21:03 Re: Calling conventions