Skip site navigation (1) Skip section navigation (2)

Re: Full text search with ORDER BY performance issue

From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Krade <krade(at)krade(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Full text search with ORDER BY performance issue
Date: 2009-07-20 12:12:20
Message-ID: Pine.LNX.4.64.0907201604000.8065@sn.sai.msu.ru (view raw or flat)
Thread:
Lists: pgsql-performance
Krade,

On Sat, 18 Jul 2009, Krade wrote:

> Here's a couple of queries:
>
> archive=> explain analyze select * from a where  comment_tsv @@ 
> plainto_tsquery('love') order by timestamp desc limit 24 offset 0;
>
> QUERY PLAN
> ----------
> Limit  (cost=453248.73..453248.79 rows=24 width=281) (actual 
> time=188441.047..188441.148 rows=24 loops=1)
>   ->  Sort  (cost=453248.73..453882.82 rows=253635 width=281) (actual 
> time=188441.043..188441.079 rows=24 loops=1)
>         Sort Key: "timestamp"
>         Sort Method:  top-N heapsort  Memory: 42kB
>         ->  Bitmap Heap Scan on a  (cost=17782.16..446166.02 rows=253635 
> width=281) (actual time=2198.930..187948.050 rows=256378 loops=1)
>               Recheck Cond: (comment_tsv @@ plainto_tsquery('love'::text))
>               ->  Bitmap Index Scan on timestamp_comment_gin 
> (cost=0.00..17718.75 rows=253635 width=0) (actual time=2113.664..2113.664 
> rows=259828 loops=1)
>                     Index Cond: (comment_tsv @@ 
> plainto_tsquery('love'::text))
> Total runtime: 188442.617 ms
> (9 rows)
>
> archive=> explain analyze select * from a where  comment_tsv @@ 
> plainto_tsquery('love') limit 24 offset 0;
>
> QUERY PLAN
> ----------
> Limit  (cost=0.00..66.34 rows=24 width=281) (actual time=14.632..53.647 
> rows=24 loops=1)
>   ->  Seq Scan on a  (cost=0.00..701071.49 rows=253635 width=281) (actual 
> time=14.629..53.588 rows=24 loops=1)
>         Filter: (comment_tsv @@ plainto_tsquery('love'::text))
> Total runtime: 53.731 ms
> (4 rows)
>
> First one runs painfully slow.

Hmm, everything is already written in explain :) In the first query 
253635 rows should be readed from disk and sorted, while in the
second query only 24 (random) rows readed from disk, so there is 4 magnitudes
difference and in the worst case you should expected time for the 1st query
about 53*10^4 ms.

>
> Is there really no way to have efficient full text search results ordered by 
> a separate field? I'm really open to all possibilities, at this point.
>
> Thanks.
>
>

 	Regards,
 		Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

In response to

Responses

pgsql-performance by date

Next:From: Marcin StępnickiDate: 2009-07-20 12:22:03
Subject: Re: Full text search with ORDER BY performance issue
Previous:From: Matthew WakelingDate: 2009-07-20 10:21:03
Subject: Re: Calling conventions

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group