Skip site navigation (1) Skip section navigation (2)

Queryplan within FTS/GIN index -search.

From: Jesper Krogh <jesper(at)krogh(dot)cc>
To: pgsql-performance(at)postgresql(dot)org
Subject: Queryplan within FTS/GIN index -search.
Date: 2009-10-22 16:28:13
Message-ID: 4AE0881D.9070201@krogh.cc (view raw or flat)
Thread:
Lists: pgsql-performance
Hi

My indexing base is now up to 7.5m documents, I have raise statistics
target to 1000 for the tsvector column in order to make the
query-planner choose more correctly. That works excellent.

Table structure is still:
ftstest=# \d ftsbody
                               Table "public.ftsbody"
      Column      |   Type   |                      Modifiers

------------------+----------+------------------------------------------------------
 id               | integer  | not null default
nextval('ftsbody_id_seq'::regclass)
 body             | text     | not null default ''::text
 ftsbody_body_fts | tsvector |
Indexes:
    "ftsbody_body_md5" UNIQUE, btree (md5(body))
    "ftsbody_id_pri_idx" UNIQUE, btree (id)
    "ftsbody_tfs_idx" gin (ftsbody_body_fts)
Triggers:
    tsvectorupdate BEFORE INSERT OR UPDATE ON uniprot FOR EACH ROW
EXECUTE PROCEDURE tsvector_update_trigger('ftsbody_body_fts',
'pg_catalog.english', 'body')


I'm searching the gin-index for 1-5 terms, where all of them matches the
same document. TERM1 is unique by itself, TERM2 is a bit more common (52
rows), TERM3 more common, TERM4 close to all and TERM5 all records.

Just quering for a unique value and add in several values that match
everything makes the run-time go significantly up.

I somehow would expect the index-search to take advantage of the MCV's
informations in the statistics that sort of translate it into a search
and post-filtering (as PG's queryplanner usually does at the SQL-level).

                                                              QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=102.45..102.45 rows=1 width=751) (actual time=3.726..3.729
rows=1 loops=1)
   ->  Sort  (cost=102.45..102.45 rows=1 width=751) (actual
time=3.722..3.723 rows=1 loops=1)
         Sort Key: id
         Sort Method:  quicksort  Memory: 27kB
         ->  Bitmap Heap Scan on ftsbody  (cost=100.42..102.44 rows=1
width=751) (actual time=3.700..3.702 rows=1 loops=1)
               Recheck Cond: (ftsbody_body_fts @@ to_tsquery('TERM1 &
TERM2'::text))
               ->  Bitmap Index Scan on ftsbody_tfs_idx
(cost=0.00..100.42 rows=1 width=0) (actual time=3.683..3.683 rows=1 loops=1)
                     Index Cond: (ftsbody_body_fts @@ to_tsquery('TERM1
& TERM2'::text))
 Total runtime: 3.790 ms
(9 rows)

                                                                QUERY
PLAN
------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=102.45..102.45 rows=1 width=751) (actual
time=850.017..850.020 rows=1 loops=1)
   ->  Sort  (cost=102.45..102.45 rows=1 width=751) (actual
time=850.013..850.015 rows=1 loops=1)
         Sort Key: id
         Sort Method:  quicksort  Memory: 27kB
         ->  Bitmap Heap Scan on ftsbody  (cost=100.42..102.44 rows=1
width=751) (actual time=849.991..849.993 rows=1 loops=1)
               Recheck Cond: (ftsbody_body_fts @@ to_tsquery('TERM1 &
TERM2 & TERM3'::text))
               ->  Bitmap Index Scan on ftsbody_tfs_idx
(cost=0.00..100.42 rows=1 width=0) (actual time=849.970..849.970 rows=1
loops=1)
                     Index Cond: (ftsbody_body_fts @@ to_tsquery('TERM1
& TERM2 & TERM3'::text))
 Total runtime: 850.084 ms
(9 rows)

                                                                 QUERY
PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=102.45..102.45 rows=1 width=751) (actual
time=1152.065..1152.068 rows=1 loops=1)
   ->  Sort  (cost=102.45..102.45 rows=1 width=751) (actual
time=1152.061..1152.062 rows=1 loops=1)
         Sort Key: id
         Sort Method:  quicksort  Memory: 27kB
         ->  Bitmap Heap Scan on ftsbody  (cost=100.42..102.44 rows=1
width=751) (actual time=1152.039..1152.041 rows=1 loops=1)
               Recheck Cond: (ftsbody_body_fts @@ to_tsquery('TERM1 &
TERM2 & TERM3 & TERM4'::text))
               ->  Bitmap Index Scan on ftsbody_tfs_idx
(cost=0.00..100.42 rows=1 width=0) (actual time=1152.020..1152.020
rows=1 loops=1)
                     Index Cond: (ftsbody_body_fts @@ to_tsquery('TERM1
& TERM2 & TERM3 & TERM4'::text))
 Total runtime: 1152.129 ms
(9 rows)

                                                                 QUERY
PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=102.45..102.45 rows=1 width=751) (actual
time=1509.043..1509.046 rows=1 loops=1)
   ->  Sort  (cost=102.45..102.45 rows=1 width=751) (actual
time=1509.040..1509.040 rows=1 loops=1)
         Sort Key: id
         Sort Method:  quicksort  Memory: 27kB
         ->  Bitmap Heap Scan on ftsbody  (cost=100.42..102.44 rows=1
width=751) (actual time=1509.018..1509.020 rows=1 loops=1)
               Recheck Cond: (ftsbody_body_fts @@ to_tsquery('TERM1 &
TERM2 & TERM3 & TERM4 & TERM5'::text))
               ->  Bitmap Index Scan on ftsbody_tfs_idx
(cost=0.00..100.42 rows=1 width=0) (actual time=1508.998..1508.998
rows=1 loops=1)
                     Index Cond: (ftsbody_body_fts @@ to_tsquery('TERM1
& TERM2 & TERM3 & TERM4 & TERM5'::text))
 Total runtime: 1509.109 ms
(9 rows)

Can (perhaps more readable) be found at http://krogh.cc/~jesper/test.out

Can this be optimized? (I cannot really prevent users from typing stuff
in that are common).

-- 
Jesper

Responses

pgsql-performance by date

Next:From: Kevin BuckhamDate: 2009-10-22 18:50:44
Subject: Table Clustering & Time Range Queries
Previous:From: Robert HaasDate: 2009-10-22 13:26:56
Subject: Re: optimizing query with multiple aggregates

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group