Re: Queryplan within FTS/GIN index -search.

From: Jesper Krogh <jesper(at)krogh(dot)cc>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Queryplan within FTS/GIN index -search.
Date: 2009-10-30 19:46:37
Message-ID: 4AEB429D.5000809@krogh.cc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi.

I've now got a test-set that can reproduce the problem where the two
fully equivalent queries (
body_fts @@ to_tsquery("commonterm & nonexistingterm")
and
body_fts @@ to_tsquery("coomonterm") AND body_fts @@
to_tsquery("nonexistingterm")

give a difference of x300 in execution time. (grows with
document-base-size).

this can now be reproduced using:

* http://krogh.cc/~jesper/fts-queryplan.pl and
http://krogh.cc/~jesper/words.txt

It build up a table with 200.000 documents where "commonterm" exists in
all of them. "nonexistingterm" is in 0.

To get the query-planner get a "sane" query I need to do a:
ftstest# set enable_seqscan=off

Then:
ftstest=# explain analyze select id from ftstest where body_fts @@
to_tsquery('nonexistingterm & commonterm');
QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on ftstest (cost=5563.09..7230.93 rows=1000 width=4)
(actual time=30.861..30.861 rows=0 loops=1)
Recheck Cond: (body_fts @@ to_tsquery('nonexistingterm &
commonterm'::text))
-> Bitmap Index Scan on ftstest_gin_idx (cost=0.00..5562.84
rows=1000 width=0) (actual time=30.856..30.856 rows=0 loops=1)
Index Cond: (body_fts @@ to_tsquery('nonexistingterm &
commonterm'::text))
Total runtime: 30.907 ms
(5 rows)

ftstest=# explain analyze select id from ftstest where body_fts @@
to_tsquery('nonexistingterm') and body_fts @@ to_tsquery('commonterm');
QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on ftstest (cost=5565.59..7238.43 rows=1000 width=4)
(actual time=0.059..0.059 rows=0 loops=1)
Recheck Cond: ((body_fts @@ to_tsquery('nonexistingterm'::text)) AND
(body_fts @@ to_tsquery('commonterm'::text)))
-> Bitmap Index Scan on ftstest_gin_idx (cost=0.00..5565.34
rows=1000 width=0) (actual time=0.057..0.057 rows=0 loops=1)
Index Cond: ((body_fts @@ to_tsquery('nonexistingterm'::text))
AND (body_fts @@ to_tsquery('commonterm'::text)))
Total runtime: 0.111 ms
(5 rows)

Run repeatedly to get a full memory recident dataset.

In this situation the former query end up being 300x slower than the
latter allthough they are fully equivalent.

--
Jesper

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Anj Adu 2009-10-30 19:53:00 Re: database size growing continously
Previous Message Jeremy Harris 2009-10-30 18:57:04 Re: database size growing continously