10x rowcount mis-estimation favouring merge over nestloop

From: Abhijit Menon-Sen <ams(at)oryx(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Cc: arnt(at)oryx(dot)com
Subject: 10x rowcount mis-estimation favouring merge over nestloop
Date: 2006-11-10 05:12:45
Message-ID: 20061110051245.GA1928@penne.toroid.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

I'm executing the following query:

select
hf.mailbox,hf.uid,hf.position,hf.part,hf.field,hf.value,
af.address,a.name,a.localpart,a.domain
from
header_fields hf
left join address_fields af
using ( mailbox, uid, position, part, field )
left join addresses a
on (af.address=a.id)
where
hf.field<=12 and (hf.part!='' or hf.value ilike '%,%') ;

The header_fields table contains 13.5M rows, of which only ~250K match
the where condition. I created an index like this:

create index hffpv on header_fields(field)
where field<=12 and (part!='' or value ilike '%,%')

By default, explain analyse shows me a plan like this:

Hash Left Join (cost=1225503.02..1506125.88 rows=2077771 width=143) (actual time=106467.431..117902.397 rows=1113355 loops=1)
Hash Cond: ("outer".address = "inner".id)
-> Merge Left Join (cost=1211354.65..1288896.97 rows=2077771 width=91) (actual time=104792.505..110648.253 rows=1113355 loops=1)
Merge Cond: (("outer".field = "inner".field) AND ("outer".part = "inner".part) AND ("outer"."position" = "inner"."position") AND ("outer".uid = "inner".uid) AND ("outer".mailbox = "inner".mailbox))
-> Sort (cost=665399.78..670594.21 rows=2077771 width=87) (actual time=39463.784..39724.772 rows=264180 loops=1)
Sort Key: hf.field, hf.part, hf."position", hf.uid, hf.mailbox
-> Bitmap Heap Scan on header_fields hf (cost=1505.63..325237.46 rows=2077771 width=87) (actual time=3495.308..33767.229 rows=264180 loops=1)
Recheck Cond: ((field <= 12) AND ((part <> ''::text) OR (value ~~* '%,%'::text)))
-> Bitmap Index Scan on hffpv (cost=0.00..1505.63 rows=2077771 width=0) (actual time=3410.069..3410.069 rows=264180 loops=1)
Index Cond: (field <= 12)
-> Sort (cost=545954.87..553141.07 rows=2874480 width=24) (actual time=65328.437..67437.846 rows=2874230 loops=1)
Sort Key: af.field, af.part, af."position", af.uid, af.mailbox
-> Seq Scan on address_fields af (cost=0.00..163548.00 rows=2874480 width=24) (actual time=12.434..4076.694 rows=2874230 loops=1)
-> Hash (cost=11714.35..11714.35 rows=190807 width=56) (actual time=1670.629..1670.629 rows=190807 loops=1)
-> Seq Scan on addresses a (cost=0.00..11714.35 rows=190807 width=56) (actual time=39.944..1398.897 rows=190807 loops=1)
Total runtime: 118381.608 ms

Note the 2M estimated rowcount in the bitmap index scan on header_fields
vs. the actual number (264180). That mis-estimation also causes it to do
a sequential scan of address_fields, though there's a usable index. If I
set both enable_mergejoin and enable_seqscan to false, I get a plan like
the following:

Hash Left Join (cost=8796.82..72064677.06 rows=2077771 width=143) (actual time=4400.706..58110.697 rows=1113355 loops=1)
Hash Cond: ("outer".address = "inner".id)
-> Nested Loop Left Join (cost=1505.63..71937416.17 rows=2077771 width=91) (actual time=3486.238..52351.567 rows=1113355 loops=1)
Join Filter: (("outer"."position" = "inner"."position") AND ("outer".part = "inner".part) AND ("outer".field = "inner".field))
-> Bitmap Heap Scan on header_fields hf (cost=1505.63..242126.62 rows=2077771 width=87) (actual time=3478.202..39181.477 rows=264180 loops=1)
Recheck Cond: ((field <= 12) AND ((part <> ''::text) OR (value ~~* '%,%'::text)))
-> Bitmap Index Scan on hffpv (cost=0.00..1505.63 rows=2077771 width=0) (actual time=3393.949..3393.949 rows=264180 loops=1)
Index Cond: (field <= 12)
-> Index Scan using af_mu on address_fields af (cost=0.00..34.26 rows=11 width=24) (actual time=0.028..0.040 rows=7 loops=264180)
Index Cond: (("outer".mailbox = af.mailbox) AND ("outer".uid = af.uid))
-> Hash (cost=4857.17..4857.17 rows=190807 width=56) (actual time=764.337..764.337 rows=190807 loops=1)
-> Index Scan using addresses_pkey on addresses a (cost=0.00..4857.17 rows=190807 width=56) (actual time=29.381..484.826 rows=190807 loops=1)
Total runtime: 58459.624 ms

Which looks like a considerably nicer plan (but still features the wild
mis-estimation, though the index has approximately the right rowcount).
I tried increasing the statistics target on header_fields.field, part,
and value to 100, but the estimate always hovers around the 2M mark.

Does anyone have any ideas about what's wrong, and how to fix it?

Thanks.

-- ams

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2006-11-10 06:15:24 Re: 10x rowcount mis-estimation favouring merge over nestloop
Previous Message Joshua D. Drake 2006-11-10 02:19:11 Re: Keeping processes open for re-use