Re: Understanding tsearch2 performance

From: Ivan Voras <ivoras(at)freebsd(dot)org>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Understanding tsearch2 performance
Date: 2010-07-14 13:37:56
Message-ID: i1kej2$12t$1@dough.gmane.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 07/14/10 15:25, Oleg Bartunov wrote:
> On Wed, 14 Jul 2010, Ivan Voras wrote:
>
>>> Returning 8449 rows could be quite long.
>>
>> You are right, I didn't test this. Issuing a query which returns a
>> smaller result set is much faster.
>>
>> But, offtopic, why would returning 8500 records, each around 100 bytes
>> long so around 8.5 MB, over local unix sockets, be so slow? The machine
>> in question has a sustained memory bendwidth of nearly 10 GB/s. Does
>> PostgreSQL spend much time marshalling the data through the socket
>> stream?
>
> It's disk access time.
> in the very bad case it could take ~5 ms (for fast drive) to get one just
> one row.

No, it's not that. The table fits in RAM, I've verified there is no disk
IO involved. Something else is wrong:

cms=> explain analyze select id,title from forum where _fts_ @@
'fer'::tsquery limit 10;
QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..43.31 rows=10 width=35) (actual time=0.194..0.373
rows=10 loops=1)
-> Index Scan using forum_fts on forum (cost=0.00..394.10 rows=91
width=35) (actual time=0.182..0.256 rows=10 loops=1)
Index Cond: (_fts_ @@ '''fer'''::tsquery)
Total runtime: 0.507 ms
(4 rows)

cms=> explain analyze select id,title from forum where _fts_ @@
'fer'::tsquery order by id limit 10;
QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------
Limit (cost=363.18..363.20 rows=10 width=35) (actual
time=118.358..118.516 rows=10 loops=1)
-> Sort (cost=363.18..363.40 rows=91 width=35) (actual
time=118.344..118.396 rows=10 loops=1)
Sort Key: id
Sort Method: top-N heapsort Memory: 25kB
-> Bitmap Heap Scan on forum (cost=29.21..361.21 rows=91
width=35) (actual time=3.066..64.091 rows=8449 loops=1)
Recheck Cond: (_fts_ @@ '''fer'''::tsquery)
-> Bitmap Index Scan on forum_fts (cost=0.00..29.19
rows=91 width=0) (actual time=2.106..2.106 rows=8449 loops=1)
Index Cond: (_fts_ @@ '''fer'''::tsquery)
Total runtime: 118.689 ms
(9 rows)

See in the first query where I have a simple LIMIT, it fetches random 10
rows quickly, but in the second one, as soon as I give it to execute and
calculate the entire result set before I limit it, the performance is
horrible.

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Stephen Frost 2010-07-14 13:49:28 Re: Understanding tsearch2 performance
Previous Message Oleg Bartunov 2010-07-14 13:25:27 Re: Understanding tsearch2 performance