Re: Query Optimization

From: "Jim C(dot) Nasby" <jim(at)nasby(dot)net>
To: Reinhard Vicinus <rvicinus(at)rjm(dot)de>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Query Optimization
Date: 2007-02-20 16:45:20
Message-ID: 20070220164520.GR19527@nasby.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

It's not necessarily the join order that's an issue; it could also be
due to the merge join that it does in the first case. I've also run into
situations where the cost estimate for a merge join is way off the mark.

Rather than forcing the join order, you might try setting
enable_mergejoin=false.

On Mon, Feb 19, 2007 at 06:03:22PM +0100, Reinhard Vicinus wrote:
> PostgreSQL version: 8.1.6
> OS: Debian etch
>
> The following query needs a lot of time because the query planner
> reorders the joins:
>
> select m.message_idnr, v.headervalue, n.headername from dbmail_messages m
> join dbmail_headervalue v ON v.physmessage_id=m.physmessage_id
> join dbmail_headername n ON v.headername_id=n.id
> where m.mailbox_idnr = 1022 AND message_idnr BETWEEN 698928 AND 1496874
> and lower(n.headername) IN
> ('from','to','cc','subject','date','message-id',
> 'priority','x-priority','references','newsgroups','in-reply-to',
> 'content-type','x-spam-status','x-spam-flag');
>
> If I prevent the query planner from reordering the joins with 'set
> join_collapse_limit=1;' then the same query is faster. At the end of the
> Mail is the output of a explain analyze for both cases.
>
> The statistics of the database are updated each night. Is there an error
> (in the statistical data) which is responsible for the reordering of the
> joins? And if not are there other alternatives for preventing join
> reordering?
>
> Thanks
> Reinhard
>
>
>
> Explain analyze with set join_collapse_limit=8:
>
> Merge Join (cost=388657.62..391332.20 rows=821 width=127) (actual
> time=82677.950..89103.192 rows=2699 loops=1)
> Merge Cond: ("outer".physmessage_id = "inner".physmessage_id)
> -> Sort (cost=2901.03..2902.61 rows=632 width=16) (actual
> time=247.238..247.578 rows=373 loops=1)
> Sort Key: m.physmessage_id
> -> Bitmap Heap Scan on dbmail_messages m (cost=9.16..2871.63
> rows=632 width=16) (actual time=38.072..246.509 rows=373 loops=1)
> Recheck Cond: (mailbox_idnr = 1022)
> Filter: ((message_idnr >= 698928) AND (message_idnr <=
> 1496874))
> -> Bitmap Index Scan on dbmail_messages_8
> (cost=0.00..9.16 rows=902 width=0) (actual time=25.561..25.561 rows=615
> loops=1)
> Index Cond: (mailbox_idnr = 1022)
> -> Sort (cost=385756.58..387089.35 rows=533108 width=127) (actual
> time=80156.731..85760.186 rows=3278076 loops=1)
> Sort Key: v.physmessage_id
> -> Hash Join (cost=51.00..285787.17 rows=533108 width=127)
> (actual time=34.519..28260.855 rows=3370242 loops=1)
> Hash Cond: ("outer".headername_id = "inner".id)
> -> Seq Scan on dbmail_headervalue v
> (cost=0.00..241200.39 rows=7840939 width=115) (actual
> time=0.006..16844.479 rows=7854485 loops=1)
> -> Hash (cost=50.72..50.72 rows=113 width=28) (actual
> time=34.493..34.493 rows=35 loops=1)
> -> Bitmap Heap Scan on dbmail_headername n
> (cost=28.44..50.72 rows=113 width=28) (actual time=11.796..34.437
> rows=35 loops=1)
> Recheck Cond: ((lower((headername)::text) =
> 'from'::text) OR (lower((headername)::text) = 'to'::text) OR
> (lower((headername)::text) = 'cc'::text) OR (lower((headername)::text) =
> 'subject'::text) OR (lower((headername)::text) = 'date'::text) OR
> (lower((headername)::text) = 'message-id'::text) OR
> (lower((headername)::text) = 'priority'::text) OR
> (lower((headername)::text) = 'x-priority'::text) OR
> (lower((headername)::text) = 'references'::text) OR
> (lower((headername)::text) = 'newsgroups'::text) OR
> (lower((headername)::text) = 'in-reply-to'::text) OR
> (lower((headername)::text) = 'content-type'::text) OR
> (lower((headername)::text) = 'x-spam-status'::text) OR (lower((hea
> dername)::text) = 'x-spam-flag'::text))
> -> BitmapOr (cost=28.44..28.44 rows=116
> width=0) (actual time=11.786..11.786 rows=0 loops=1)
> -> Bitmap Index Scan on
> dbmail_headername_lower_headername (cost=0.00..2.03 rows=8 width=0)
> (actual time=0.037..0.037 rows=3 loops=1)
> Index Cond:
> (lower((headername)::text) = 'from'::text)
> -> Bitmap Index Scan on
> dbmail_headername_lower_headername (cost=0.00..2.03 rows=8 width=0)
> (actual time=0.013..0.013 rows=3 loops=1)
> Index Cond:
> (lower((headername)::text) = 'to'::text)
> -> Bitmap Index Scan on
> dbmail_headername_lower_headername (cost=0.00..2.03 rows=8 width=0)
> (actual time=0.013..0.013 rows=3 loops=1)
> Index Cond:
> (lower((headername)::text) = 'cc'::text)
> -> Bitmap Index Scan on
> dbmail_headername_lower_headername (cost=0.00..2.03 rows=8 width=0)
> (actual time=0.014..0.014 rows=3 loops=1)
> Index Cond:
> (lower((headername)::text) = 'subject'::text)
> -> Bitmap Index Scan on
> dbmail_headername_lower_headername (cost=0.00..2.03 rows=8 width=0)
> (actual time=0.014..0.014 rows=3 loops=1)
> Index Cond:
> (lower((headername)::text) = 'date'::text)
> -> Bitmap Index Scan on
> dbmail_headername_lower_headername (cost=0.00..2.03 rows=8 width=0)
> (actual time=0.019..0.019 rows=4 loops=1)
> Index Cond:
> (lower((headername)::text) = 'message-id'::text)
> -> Bitmap Index Scan on
> dbmail_headername_lower_headername (cost=0.00..2.03 rows=8 width=0)
> (actual time=0.012..0.012 rows=2 loops=1)
> Index Cond:
> (lower((headername)::text) = 'priority'::text)
> -> Bitmap Index Scan on
> dbmail_headername_lower_headername (cost=0.00..2.03 rows=8 width=0)
> (actual time=0.032..0.032 rows=4 loops=1)
> Index Cond:
> (lower((headername)::text) = 'x-priority'::text)
> -> Bitmap Index Scan on
> dbmail_headername_lower_headername (cost=0.00..2.03 rows=8 width=0)
> (actual time=0.015..0.015 rows=1 loops=1)
> Index Cond:
> (lower((headername)::text) = 'references'::text)
> -> Bitmap Index Scan on
> dbmail_headername_lower_headername (cost=0.00..2.03 rows=8 width=0)
> (actual time=0.009..0.009 rows=0 loops=1)
> Index Cond:
> (lower((headername)::text) = 'newsgroups'::text)
> -> Bitmap Index Scan on
> dbmail_headername_lower_headername (cost=0.00..2.03 rows=8 width=0)
> (actual time=0.014..0.014 rows=3 loops=1)
> Index Cond:
> (lower((headername)::text) = 'in-reply-to'::text)
> -> Bitmap Index Scan on
> dbmail_headername_lower_headername (cost=0.00..2.03 rows=8 width=0)
> (actual time=0.013..0.013 rows=1 loops=1)
> Index Cond:
> (lower((headername)::text) = 'content-type'::text)
> -> Bitmap Index Scan on
> dbmail_headername_lower_headername (cost=0.00..2.03 rows=8 width=0)
> (actual time=11.549..11.549 rows=2 loops=1)
> Index Cond:
> (lower((headername)::text) = 'x-spam-status'::text)
> -> Bitmap Index Scan on
> dbmail_headername_lower_headername (cost=0.00..2.03 rows=8 width=0)
> (actual time=0.019..0.019 rows=3 loops=1)
> Index Cond:
> (lower((headername)::text) = 'x-spam-flag'::text)
> Total runtime: 89277.937 ms
> (47 rows)
>
>
>
> Explain analyze with set join_collapse_limit=1:
>
> Hash Join (cost=51.00..1607155.00 rows=821 width=127) (actual
> time=14.640..47.851 rows=2699 loops=1)
> Hash Cond: ("outer".headername_id = "inner".id)
> -> Nested Loop (cost=0.00..1607035.43 rows=12071 width=115) (actual
> time=0.085..25.057 rows=7025 loops=1)
> -> Index Scan using dbmail_messages_mailbox_idx on
> dbmail_messages m (cost=0.00..3515.08 rows=632 width=16) (actual
> time=0.064..1.070 rows=373 loops=1)
> Index Cond: (mailbox_idnr = 1022)
> Filter: ((message_idnr >= 698928) AND (message_idnr <=
> 1496874))
> -> Index Scan using dbmail_headervalue_physmsg_id on
> dbmail_headervalue v (cost=0.00..2526.34 rows=870 width=115) (actual
> time=0.010..0.035 rows=19 loops=373)
> Index Cond: (v.physmessage_id = "outer".physmessage_id)
> -> Hash (cost=50.72..50.72 rows=113 width=28) (actual
> time=14.540..14.540 rows=35 loops=1)
> -> Bitmap Heap Scan on dbmail_headername n (cost=28.44..50.72
> rows=113 width=28) (actual time=14.429..14.492 rows=35 loops=1)
> Recheck Cond: ((lower((headername)::text) = 'from'::text)
> OR (lower((headername)::text) = 'to'::text) OR
> (lower((headername)::text) = 'cc'::text) OR (lower((headername)::text) =
> 'subject'::text) OR (lower((headername)::text) = 'date'::text) OR
> (lower((headername)::text) = 'message-id'::text) OR
> (lower((headername)::text) = 'priority'::text) OR
> (lower((headername)::text) = 'x-priority'::text) OR
> (lower((headername)::text) = 'references'::text) OR
> (lower((headername)::text) = 'newsgroups'::text) OR
> (lower((headername)::text) = 'in-reply-to'::text) OR
> (lower((headername)::text) = 'content-type'::text) OR
> (lower((headername)::text) = 'x-spam-status'::text) OR
> (lower((headername)::text) = 'x-spam-flag'::text))
> -> BitmapOr (cost=28.44..28.44 rows=116 width=0)
> (actual time=14.418..14.418 rows=0 loops=1)
> -> Bitmap Index Scan on
> dbmail_headername_lower_headername (cost=0.00..2.03 rows=8 width=0)
> (actual time=14.197..14.197 rows=3 loops=1)
> Index Cond: (lower((headername)::text) =
> 'from'::text)
> -> Bitmap Index Scan on
> dbmail_headername_lower_headername (cost=0.00..2.03 rows=8 width=0)
> (actual time=0.015..0.015 rows=3 loops=1)
> Index Cond: (lower((headername)::text) =
> 'to'::text)
> -> Bitmap Index Scan on
> dbmail_headername_lower_headername (cost=0.00..2.03 rows=8 width=0)
> (actual time=0.012..0.012 rows=3 loops=1)
> Index Cond: (lower((headername)::text) =
> 'cc'::text)
> -> Bitmap Index Scan on
> dbmail_headername_lower_headername (cost=0.00..2.03 rows=8 width=0)
> (actual time=0.013..0.013 rows=3 loops=1)
> Index Cond: (lower((headername)::text) =
> 'subject'::text)
> -> Bitmap Index Scan on
> dbmail_headername_lower_headername (cost=0.00..2.03 rows=8 width=0)
> (actual time=0.014..0.014 rows=3 loops=1)
> Index Cond: (lower((headername)::text) =
> 'date'::text)
> -> Bitmap Index Scan on
> dbmail_headername_lower_headername (cost=0.00..2.03 rows=8 width=0)
> (actual time=0.019..0.019 rows=4 loops=1)
> Index Cond: (lower((headername)::text) =
> 'message-id'::text)
> -> Bitmap Index Scan on
> dbmail_headername_lower_headername (cost=0.00..2.03 rows=8 width=0)
> (actual time=0.011..0.011 rows=2 loops=1)
> Index Cond: (lower((headername)::text) =
> 'priority'::text)
> -> Bitmap Index Scan on
> dbmail_headername_lower_headername (cost=0.00..2.03 rows=8 width=0)
> (actual time=0.031..0.031 rows=4 loops=1)
> Index Cond: (lower((headername)::text) =
> 'x-priority'::text)
> -> Bitmap Index Scan on
> dbmail_headername_lower_headername (cost=0.00..2.03 rows=8 width=0)
> (actual time=0.013..0.013 rows=1 loops=1)
> Index Cond: (lower((headername)::text) =
> 'references'::text)
> -> Bitmap Index Scan on
> dbmail_headername_lower_headername (cost=0.00..2.03 rows=8 width=0)
> (actual time=0.008..0.008 rows=0 loops=1)
> Index Cond: (lower((headername)::text) =
> 'newsgroups'::text)
> -> Bitmap Index Scan on
> dbmail_headername_lower_headername (cost=0.00..2.03 rows=8 width=0)
> (actual time=0.014..0.014 rows=3 loops=1)
> Index Cond: (lower((headername)::text) =
> 'in-reply-to'::text)
> -> Bitmap Index Scan on
> dbmail_headername_lower_headername (cost=0.00..2.03 rows=8 width=0)
> (actual time=0.012..0.012 rows=1 loops=1)
> Index Cond: (lower((headername)::text) =
> 'content-type'::text)
> -> Bitmap Index Scan on
> dbmail_headername_lower_headername (cost=0.00..2.03 rows=8 width=0)
> (actual time=0.028..0.028 rows=2 loops=1)
> Index Cond: (lower((headername)::text) =
> 'x-spam-status'::text)
> -> Bitmap Index Scan on
> dbmail_headername_lower_headername (cost=0.00..2.03 rows=8 width=0)
> (actual time=0.018..0.018 rows=3 loops=1)
> Index Cond: (lower((headername)::text) =
> 'x-spam-flag'::text)
> Total runtime: 49.634 ms
> (41 rows)
>
> --
> Reinhard Vicinus
> rjm business solutions GmbH
> Sperlingweg 3,
> 68623 Lampertheim
> Tel. 06206 9513084
> Fax 06206 910315
> --
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
> choose an index scan if your joining column's datatypes do not
> match
>

--
Jim Nasby jim(at)nasby(dot)net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Glenn Sullivan 2007-02-20 23:21:52 SELECT performance problem
Previous Message Teodor Sigaev 2007-02-20 11:15:27 Re: Having performance problems with TSearch2