The cost of visibillity testing? (gin-search)

From: Jesper Krogh <jesper(at)krogh(dot)cc>
To: pgsql-hackers(at)postgresql(dot)org
Subject: The cost of visibillity testing? (gin-search)
Date: 2010-12-21 19:25:16
Message-ID: 4D10FF1C.9030203@krogh.cc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Hackers.

I have a feeling that GIN is "cheating" on the visibillity checks:

test=# set enable_seqscan = off;
SET
Time: 0.129 ms
test=# select count(id) from fts_test where fts @@ to_tsquery('core');
count
--------
158827
(1 row)

Time: 95.530 ms
test=# explain select count(id) from fts_test where fts @@
to_tsquery('core');
QUERY PLAN
----------------------------------------------------------------------------------------------
Aggregate (cost=211571.52..211571.53 rows=1 width=4)
-> Bitmap Heap Scan on fts_test (cost=134925.95..211174.01
rows=159004 width=4)
Recheck Cond: (fts @@ to_tsquery('core'::text))
-> Bitmap Index Scan on fts_idx (cost=0.00..134886.20
rows=159004 width=0)
Index Cond: (fts @@ to_tsquery('core'::text))
(5 rows)

Time: 0.609 ms

test=# select count(id) from fts_test;
count
--------
168556
(1 row)

Time: 164.655 ms

test=# explain select count(id) from fts_test;
QUERY PLAN
------------------------------------------------------------------------------------------------
Aggregate (cost=10000075969.95..10000075969.96 rows=1 width=4)
-> Seq Scan on fts_test (cost=10000000000.00..10000075548.56
rows=168556 width=4)
(2 rows)

Time: 0.338 ms

This is run multiple times for both queries and the seqscan of the table
is consistently about 1.8 times more expensive than the fts-scan.
This is all on a fully memory cached dataset.

The first query should have the cost of the GIN-search +
visibillity-test of 158K tuples,
the latter should have the cost of visibillity-testing 168K tuples. If
we set the cost
of actually searching GIN to 0 then the gin-search - visibillity costs:
95/158000 0.000373ms/tuple
where the seq-scan case costs close to 0.001ms/tuple (close to 3 times
as much).

So I have a strong feeling that GIN is cheating on the visibillity tests
otherwise I have problems imagining how it ever can become faster to execute
than the seq_scan of the table.

Or is a Bitmap Heap Scan simply 3 times faster than a Seq-scan for
visibillity-testing?

What have I missed in the logic?

Thanks.

--
Jesper

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2010-12-21 19:33:04 Re: SQL/MED - core functionality
Previous Message Eric Ridge 2010-12-21 18:45:28 Re: bug in SignalSomeChildren