Quick Links

BUG #2007: Problem with multiple JOIN and long IN and bitmap index

From:	"A Gattiker" <agattik(at)gmail(dot)com>
To:	pgsql-bugs(at)postgresql(dot)org
Subject:	BUG #2007: Problem with multiple JOIN and long IN and bitmap index
Date:	2005-10-28 11:39:14
Message-ID:	20051028113914.B65F1F0C34@svr2.postgresql.org
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-bugs

The following bug has been logged online:

Bug reference: 2007
Logged by: A Gattiker
Email address: agattik(at)gmail(dot)com
PostgreSQL version: 8.1beta4
Operating system: alphaev68-dec-osf5.1a
Description: Problem with multiple JOIN and long IN and bitmap index
Details:

I have a huge table (100M rows) loaded from XML data that I use to make
lookups by going up and down the tree. To conveniently access text elements,
I have indexed them by md5 (from pgcrypto).

In 7.4 I had problems with this type of query (automatically generated by a
script). I would typically need to join 6-10 tables, but at the 4th to 6th
table it would suddenly stop planning to use the $parent_name index and do a
seqscan on name instead.

Table "xmldb.uniref90"
Column | Type | Modifiers
--------+----------+-----------
id | integer | not null
type | smallint |
name | text |
value | text |
parent | integer |
pos | integer |
Indexes:
"pk_uniref90" PRIMARY KEY, btree (id)
"i_uniref90$parent_name" btree (parent, name)
"i_uniref90$value" btree (digest(value, 'md5'::text))

explain SELECT * FROM uniref90 _x1
JOIN uniref90 _x2 ON ((_x1.parent = _x2.id))
JOIN uniref90 _x3 ON ((_x3.parent = _x2.id) AND (_x3.name IN
('representativeMember', 'member')))
JOIN uniref90 _x4 ON ((_x4.parent = _x3.id) AND (_x4.name =
'sequence'))
JOIN uniref90 _x5 ON ((_x5.parent = _x4.id) AND (_x5.name = 'length'))
JOIN uniref90 _x6 ON ((_x6.parent = _x3.id) AND (_x6.name =
'dbReference'))
WHERE (digest(_x1.value, 'md5') IN (digest('UniRef90_Q4J9I1', 'md5'),
digest('UniRef90_Q4H5E1', 'md5'))) AND (_x1.name = 'id')

I didn't mind it because I could rewrite the program to create a temporary
table at each step instead.

However in 8.1b4 neither strategy gives good results. The plan for the query
posted above is:

----------------------------------------------------------------------------
----------------------------------------------------------------------------
--------------------------------------------------------------------
Nested Loop (cost=2049792.49..3391544.88 rows=49 width=300)
-> Nested Loop (cost=2049792.49..3391396.61 rows=49 width=250)
-> Nested Loop (cost=2049792.49..3385589.99 rows=74 width=200)
-> Nested Loop (cost=2049792.49..3163685.61 rows=2825
width=150)
-> Hash Join (cost=2049792.49..2998941.67 rows=54353
width=100)
Hash Cond: ("outer".parent = "inner".id)
-> Bitmap Heap Scan on uniref90 _x5
(cost=583086.55..1439372.87 rows=2064026 width=50)
Recheck Cond: (name = 'length'::text)
-> Bitmap Index Scan on
"i_uniref90$parent_name" (cost=0.00..583086.55 rows=2064026 width=0)
Index Cond: (name = 'length'::text)
-> Hash (cost=1439372.87..1439372.87
rows=2064026 width=50)
-> Bitmap Heap Scan on uniref90 _x4
(cost=583086.55..1439372.87 rows=2064026 width=50)
Recheck Cond: (name =
'sequence'::text)
-> Bitmap Index Scan on
"i_uniref90$parent_name" (cost=0.00..583086.55 rows=2064026 width=0)
Index Cond: (name =
'sequence'::text)
-> Index Scan using pk_uniref90 on uniref90 _x3
(cost=0.00..3.02 rows=1 width=50)
Index Cond: ("outer".parent = _x3.id)
Filter: ((name = 'representativeMember'::text) OR
(name = 'member'::text))
-> Index Scan using "i_uniref90$parent_name" on uniref90 _x6
(cost=0.00..78.08 rows=38 width=50)
Index Cond: ((_x6.parent = "outer".id) AND (_x6.name =
'dbReference'::text))
-> Index Scan using "i_uniref90$parent_name" on uniref90 _x1
(cost=0.00..78.46 rows=1 width=50)
Index Cond: (("outer".parent = _x1.parent) AND (_x1.name =
'id'::text))
Filter: ((digest(value, 'md5'::text) =
E'\\217\\245\\200\\212\\017O\\315Q\\010\\003\\227N\\005\\202L\\303'::bytea)
OR (digest(value, 'md5'::text) =
E'@\\015\\363/3.\\376d}\\220\\177\\211\\2754C('::bytea))
-> Index Scan using pk_uniref90 on uniref90 _x2 (cost=0.00..3.01 rows=1
width=50)
Index Cond: ("outer".parent = _x2.id)

the first step is a Bitmap Index Scan on the second column of a composite
index, and the optimizer chooses to access the 4th table first instead of
the 1st, exactly like with 7.4.

The approach with temporary tables also now fails. Postgres 8.1 beta 4
behaves well with a IN () condition with a couple of items, but with 51
items in the join (I have not determined the breakpoint), it changes its
strategy.

explain analyze SELECT * FROM uniref90 WHERE (digest(value, 'md5') IN
(digest('UniRef90_Q4HC20', 'md5'), digest('UniRef90_Q60AK7', 'md5'),
digest('UniRef90_Q62G98', 'md5'))) AND (name = 'id');

QUERY PLAN

----------------------------------------------------------------------------
----------------------------------------------------------------------------
----------------------------------------------------------------------------
----------------------------------------------------------------------------
-----------------------------------
Bitmap Heap Scan on uniref90 (cost=462.98..186334.78 rows=1564 width=50)
(actual time=0.000..0.000 rows=3 loops=1)
Recheck Cond: ((digest(value, 'md5'::text) =
E'\\020\\003\\257\\353\\310\\033\\237\\361\\307\\261\\030\\252nG\\243_'::byt
ea) OR (digest(value, 'md5'::text) =
E'g\\005\\266\\365\\375s8!.\\215\\027\\022\\344\\350k\\267'::bytea) OR
(digest(value, 'md5'::text) =
E'h\\212\\245\\356\\227\\034\\356\\010I\\212\\247\\243No\\232\\006'::bytea))

Filter: (name = 'id'::text)
-> BitmapOr (cost=462.98..462.98 rows=59420 width=0) (actual
time=0.000..0.000 rows=0 loops=1)
-> Bitmap Index Scan on "i_uniref90$value" (cost=0.00..154.33
rows=19807 width=0) (actual time=0.000..0.000 rows=1 loops=1)
Index Cond: (digest(value, 'md5'::text) =
E'\\020\\003\\257\\353\\310\\033\\237\\361\\307\\261\\030\\252nG\\243_'::byt
ea)
-> Bitmap Index Scan on "i_uniref90$value" (cost=0.00..154.33
rows=19807 width=0) (actual time=0.000..0.000 rows=1 loops=1)
Index Cond: (digest(value, 'md5'::text) =
E'g\\005\\266\\365\\375s8!.\\215\\027\\022\\344\\350k\\267'::bytea)
-> Bitmap Index Scan on "i_uniref90$value" (cost=0.00..154.33
rows=19807 width=0) (actual time=0.000..0.000 rows=1 loops=1)
Index Cond: (digest(value, 'md5'::text) =
E'h\\212\\245\\356\\227\\034\\356\\010I\\212\\247\\243No\\232\\006'::bytea)
Total runtime: 0.000 ms
(11 rows)

Now explain the same query with 51 digest() in the IN():

Bitmap Heap Scan on uniref90 (cost=590803.09..686808.41 rows=25918
width=48)
Recheck Cond: (((digest(value, 'md5'::text) =
E'\\020\\003\\257\\353\\310\\033\\237\\361\\307\\261\\030\\252nG\\243_'::byt
ea) OR (digest(value, 'md5'[.......]25'::bytea)) AND (name = 'id'::text))
-> BitmapAnd (cost=590803.09..590803.09 rows=26079 width=0)
-> BitmapOr (cost=7716.29..7716.29 rows=990334 width=0)
-> Bitmap Index Scan on "i_uniref90$value"
(cost=0.00..154.33 rows=19807 width=0)
Index Cond: (digest(value, 'md5'::text) =
E'\\020\\003\\257\\353\\310\\033\\237\\361\\307\\261\\030\\252nG\\243_'::byt
ea)
-> Bitmap Index Scan on "i_uniref90$value"
(cost=0.00..154.33 rows=19807 width=0)
Index Cond: (digest(value, 'md5'::text) =
E'g\\005\\266\\365\\375s8!.\\215\\027\\022\\344\\350k\\267'::bytea)
-> Bitmap Index Scan on "i_uniref90$value"
(cost=0.00..154.33 rows=19807 width=0)
[...]
-> Bitmap Index Scan on "i_uniref90$value"
(cost=0.00..154.33 rows=19807 width=0)
Index Cond: (digest(value, 'md5'::text) =
E'\\3201M,m\\232\\034j^3rX\\024\\205)\\322'::bytea)
-> Bitmap Index Scan on "i_uniref90$value"
(cost=0.00..154.33 rows=19807 width=0)
Index Cond: (digest(value, 'md5'::text) =
E'mV)\\263\\036\\037`P\\3072\\001\\325\\032E\\036\\025'::bytea)
-> Bitmap Index Scan on "i_uniref90$parent_name"
(cost=0.00..583086.55 rows=2064026 width=0)
Index Cond: (name = 'id'::text)

It has again decided to use the second column of a composite index for a
bitmap scan! The query now takes forever.

With "set enable_bitmapscan to off" it's even worse:

Index Scan using "i_uniref90$parent_name" on uniref90 _x1
(cost=0.00..5192433.33 rows=25918 width=48)
Index Cond: (name = 'id'::text)
Filter: ((digest(value, 'md5'::text) =
E'\\020\\003\\257\\353\\310\\033\\237\\361\\307\\261\\030\\252nG\\243_'::byt
ea) OR (digest(value, 'md5'::text) = ...

I find these phenomena rather surprising.

Please keep up the great work.

Browse pgsql-bugs by date

	From	Date	Subject
Next Message	Antonio Pedro Lopes	2005-10-28 12:59:43	BUG #2008: IUSR (Internet Guest Account) - ADO Connection problems IUSR missing permissions.
Previous Message	Atanas Hristov	2005-10-28 06:53:03	BUG #2006: queryoptimizer and comparing a primary key of biginteger and a literal