Skip site navigation (1) Skip section navigation (2)

Understanding tsearch2 performance

From: Ivan Voras <ivoras(at)freebsd(dot)org>
To: pgsql-performance(at)postgresql(dot)org
Subject: Understanding tsearch2 performance
Date: 2010-07-14 11:47:52
Message-ID: i1k84l$984$1@dough.gmane.org (view raw or flat)
Thread:
Lists: pgsql-performance
Here's a query and its EXPLAIN ANALYZE output:

cms=> select count(*) from forum;
 count
-------
 90675
(1 row)

cms=> explain analyze select id,title from forum where _fts_ @@
'fer'::tsquery;
                                                      QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on forum  (cost=29.21..361.21 rows=91 width=35)
(actual time=2.946..63.646 rows=8449 loops=1)
   Recheck Cond: (_fts_ @@ '''fer'''::tsquery)
   ->  Bitmap Index Scan on forum_fts  (cost=0.00..29.19 rows=91
width=0) (actual time=2.119..2.119 rows=8449 loops=1)
         Index Cond: (_fts_ @@ '''fer'''::tsquery)
 Total runtime: 113.641 ms
(5 rows)

The problem is - tsearch2 seems too slow. I have nothing to compare it
to but 113 ms for searching through this small table of 90,000 records
seems too slow. The forum_fts index is of GIN type and the table
certainly fits into RAM.

When I issue a dumb query without an index, I get a comparable order of
magnitude performance:

cms=> explain analyze select id,title from forum where content ilike
'%fer%';
                                                 QUERY PLAN

------------------------------------------------------------------------------------------------------------
 Seq Scan on forum  (cost=0.00..7307.44 rows=3395 width=35) (actual
time=0.030..798.375 rows=10896 loops=1)
   Filter: (content ~~* '%fer%'::text)
 Total runtime: 864.384 ms
(3 rows)

cms=> explain analyze select id,title from forum where content like '%fer%';
                                                QUERY PLAN

-----------------------------------------------------------------------------------------------------------
 Seq Scan on forum  (cost=0.00..7307.44 rows=3395 width=35) (actual
time=0.024..146.959 rows=7596 loops=1)
   Filter: (content ~~ '%fer%'::text)
 Total runtime: 191.732 ms
(3 rows)

Some peculiarities of the setup which might or might not influence this
performance:

1) I'm using ICU-patched postgresql because I cannot use my UTF-8 locale
otherwise - this is why the difference between the dumb queries is large
(but I don't see how this can influence tsearch2 since it pre-builds the
tsvector data with lowercase lexemes)

2) My tsearch2 lexer is somewhat slow (but I don't see how it can
influence these read-only queries on a pre-built, lexed and indexed data)

Any ideas?


Responses

pgsql-performance by date

Next:From: Oleg BartunovDate: 2010-07-14 12:31:28
Subject: Re: Understanding tsearch2 performance
Previous:From: Scott MarloweDate: 2010-07-14 06:44:53
Subject: Re: Need help in performance tuning.

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group