BitmapScan mishaps

From: Listmail <lists(at)peufeu(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: BitmapScan mishaps
Date: 2007-04-03 15:01:56
Message-ID: op.tp7qpie5zcizji@apollo13
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


Hello everyone !

I have this query :

annonces=> EXPLAIN ANALYZE SELECT * FROM annonces AS a WHERE
detect_time > CURRENT_TIMESTAMP - '7 DAY'::INTERVAL
AND detect_time >= '2006-10-30 16:17:45.064793'
AND vente
AND surface IS NOT NULL AND price IS NOT NULL
AND type_id IN
(1,12,24,17,18,19,20,33,35,50,51,7,52,4,13,41,14,16,26,28,43,53,15,29,30,31,45,32,34,46,47,6)
AND (a.zipcode IN (69001,69002,69003,69004,69005,69006,69007,69008,69009)
OR a.city_id IN (27595)
OR a.coords &&
'(45.74101689082,4.8371263505564),(45.75898310918,4.8628736494436)'::BOX) ;
QUERY
PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on annonces a (cost=1657.06..7145.98 rows=1177
width=691) (actual time=118.342..118.854 rows=194 loops=1)
Recheck Cond: (((vente AND (zipcode = ANY
('{69001,69002,69003,69004,69005,69006,69007,69008,69009}'::integer[])))
OR (vente AND (city_id = 27595)) OR (coords &&
'(45.75898310918,4.8628736494436),(45.74101689082,4.8371263505564)'::box))
AND (detect_time > (now() - '7 days'::interval)) AND (detect_time >=
'2006-10-30 16:17:45.064793'::timestamp without time zone))
Filter: (vente AND (surface IS NOT NULL) AND (price IS NOT NULL) AND
(type_id = ANY
('{1,12,24,17,18,19,20,33,35,50,51,7,52,4,13,41,14,16,26,28,43,53,15,29,30,31,45,32,34,46,47,6}'::integer[])))
-> BitmapAnd (cost=1657.06..1657.06 rows=2465 width=0) (actual
time=118.294..118.294 rows=0 loops=1)
-> BitmapOr (cost=133.83..133.83 rows=4368 width=0) (actual
time=2.903..2.903 rows=0 loops=1)
-> Bitmap Index Scan on annonces_zip (cost=0.00..56.54
rows=1825 width=0) (actual time=0.599..0.599 rows=1580 loops=1)
Index Cond: ((vente = true) AND (zipcode = ANY
('{69001,69002,69003,69004,69005,69006,69007,69008,69009}'::integer[])))
-> Bitmap Index Scan on annonces_city (cost=0.00..43.30
rows=1904 width=0) (actual time=0.464..0.464 rows=1575 loops=1)
Index Cond: ((vente = true) AND (city_id = 27595))
-> Bitmap Index Scan on annonces_coords (cost=0.00..33.10
rows=640 width=0) (actual time=1.837..1.837 rows=2166 loops=1)
Index Cond: (coords &&
'(45.75898310918,4.8628736494436),(45.74101689082,4.8371263505564)'::box)
-> Bitmap Index Scan on annonces_date (cost=0.00..1522.68
rows=72241 width=0) (actual time=114.930..114.930 rows=68022 loops=1)
Index Cond: ((detect_time > (now() - '7 days'::interval))
AND (detect_time >= '2006-10-30 16:17:45.064793'::timestamp without time
zone))
Total runtime: 119.000 ms
(14 lignes)

The interesting part is :

Bitmap Index Scan on annonces_date (cost=0.00..1522.68 rows=72241
width=0) (actual time=114.930..114.930 rows=68022 loops=1)

It bitmapscans about half the table...
I realized this index was actually useless for all my queries, so I
dropped it, and behold :

QUERY
PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on annonces a (cost=133.83..7583.77 rows=1176
width=691) (actual time=5.483..18.731 rows=194 loops=1)
Recheck Cond: ((vente AND (zipcode = ANY
('{69001,69002,69003,69004,69005,69006,69007,69008,69009}'::integer[])))
OR (vente AND (city_id = 27595)) OR (coords &&
'(45.75898310918,4.8628736494436),(45.74101689082,4.8371263505564)'::box))
Filter: ((detect_time > (now() - '7 days'::interval)) AND (detect_time
>= '2006-10-30 16:17:45.064793'::timestamp without time zone) AND vente
AND (surface IS NOT NULL) AND (price IS NOT NULL) AND (type_id = ANY
('{1,12,24,17,18,19,20,33,35,50,51,7,52,4,13,41,14,16,26,28,43,53,15,29,30,31,45,32,34,46,47,6}'::integer[])))
-> BitmapOr (cost=133.83..133.83 rows=4368 width=0) (actual
time=2.648..2.648 rows=0 loops=1)
-> Bitmap Index Scan on annonces_zip (cost=0.00..56.54
rows=1825 width=0) (actual time=0.505..0.505 rows=1580 loops=1)
Index Cond: ((vente = true) AND (zipcode = ANY
('{69001,69002,69003,69004,69005,69006,69007,69008,69009}'::integer[])))
-> Bitmap Index Scan on annonces_city (cost=0.00..43.30
rows=1904 width=0) (actual time=0.415..0.415 rows=1575 loops=1)
Index Cond: ((vente = true) AND (city_id = 27595))
-> Bitmap Index Scan on annonces_coords (cost=0.00..33.10
rows=640 width=0) (actual time=1.725..1.725 rows=2166 loops=1)
Index Cond: (coords &&
'(45.75898310918,4.8628736494436),(45.74101689082,4.8371263505564)'::box)
Total runtime: 18.859 ms

I guess saving 68022 rows of index scan is worth it !
Now 18 ms to extract the rows I want from that table (not huge, but still
about 70 megabytes) is what I call : nice.

Just thought it might be useful to some of you. I should have remembered
KISS !

BTW, shouldn't the planner think about this also ? Bitmap-scanning half a
table is likely to be slower than seq scan anyway...

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Jonathan Vanasco 2007-04-03 15:27:57 Re: inserting multiple values in version 8.1.5
Previous Message A. Kretschmer 2007-04-03 14:33:18 Re: inserting multiple values in version 8.1.5