Re: How does filter order relate to query optimization?

From: Andrew Edson <cheighlund(at)yahoo(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: How does filter order relate to query optimization?
Date: 2007-02-27 19:24:38
Message-ID: 30147.25422.qm@web34208.mail.mud.yahoo.com
Lists: pgsql-general

I have now, on two of our setups, and I'm getting another, er, interesting problem: same statement, same data, wildly different times. One's taking nearly half an hour; the other's ready within a few minutes. It's a rather large database, so I'm not surprised at a little delay (although shortening that as much as possible is the goal), but still...

Statement in question:
select substring(ttrans.tran_dt, 1, 10) as tran_dt, ttrans.dist_id as dist_id,
       ttrans.cntrct_id as cntrct_id, cntrt.cntrtyp_cd as cntrt_type,
       cntrt.actual_amt as cntrt_amt, acntrec.mth_reck as mth_reck,
       persn.frst_nm as fnm, persn.lst_nm as lnm
  from ttrans, cntrt, acntrec, persn, custm, addru
 where ttrans.tran_dt >= '2007-01-01'
   and ttrans.tran_dt < '2007-02-01'
   and ttrans.cntrct_id = cntrt.cntrct_id
   and cntrt.cntrct_seq = addru.cntrct_seq
   and addru.aunit_seq = acntrec.aunit_seq
   and cntrt.cntrtyp_cd = 260
   and cntrt.clnt_seq = custm.clnt_seq
   and custm.person_seq = persn.person_seq
   and acntrec.cd_inst = 49
   and acntrec.months = 49
   and cntrt.dow_flg1 = 'NO'
 order by ttrans.dist_id asc, cntrt.cntrct_id asc, cntrt.cntrct_id asc,
          cntrt.cntrct_id asc, cntrt.cntrct_id asc;

Results: Slow system
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=169629.77..169629.78 rows=1 width=91) (actual time=1262832.907..1262833.259 rows=120 loops=1)
Sort Key: ttrans.dist_id, cntrt.cntrct_id
-> Nested Loop (cost=0.00..169629.76 rows=1 width=91) (actual time=18755.330..1262808.593 rows=120 loops=1)
Join Filter: (("inner".cntrct_id)::bpchar = "outer".cntrct_id)
-> Nested Loop (cost=0.00..18902.45 rows=1 width=68) (actual time=369.208..20016.454 rows=65 loops=1)
-> Nested Loop (cost=0.00..18897.73 rows=1 width=48) (actual time=347.437..15905.930 rows=65 loops=1)
-> Nested Loop (cost=0.00..18892.32 rows=1 width=48) (actual time=338.897..12678.319 rows=65 loops=1)
-> Nested Loop (cost=0.00..18875.15 rows=3 width=43) (actual time=317.317..8104.591 rows=68 loops=1)
-> Seq Scan on cntrt (cost=0.00..18857.61 rows=3 width=43) (actual time=317.181..4249.752 rows=68 loops=1)
Filter: (((cntrtyp_cd)::text = '260'::text) AND (dow_flg1 = 'NO'::bpchar))
-> Index Scan using fk_cntrct on addru (cost=0.00..5.83 rows=1 width=8) (actual time=56.661..56.666 rows=1 loops=68)
Index Cond: ("outer".cntrct_seq = addru.cntrct_seq)
-> Index Scan using fk_aunit on acntrec (cost=0.00..5.71 rows=1 width=13) (actual time=66.415..67.243 rows=1 loops=68)
Index Cond: ("outer".aunit_seq = acntrec.aunit_seq)
Filter: ((cd_inst = 49) AND ((months)::text = '49'::text))
-> Index Scan using "pkeyCUSTM" on custm (cost=0.00..5.39 rows=1 width=8) (actual time=49.633..49.638 rows=1 loops=65)
Index Cond: ("outer".clnt_seq = custm.clnt_seq)
-> Index Scan using "pkeyPERSN" on persn (cost=0.00..4.70 rows=1 width=28) (actual time=63.212..63.220 rows=1 loops=65)
Index Cond: ("outer".person_seq = persn.person_seq)
-> Seq Scan on ttrans (cost=0.00..149327.10 rows=112017 width=23) (actual time=163.610..18845.905 rows=86415 loops=65)
Filter: ((tran_dt >= '2007-01-01 00:00:00-06'::timestamp with time zone) AND (tran_dt < '2007-02-01 00:00:00-06'::timestamp with time zone))
Total runtime: 1262856.689 ms
(22 rows)

Results: 'Normal' system
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=177304.02..177304.03 rows=1 width=125) (actual time=271165.198..271165.432 rows=120 loops=1)
Sort Key: ttrans.dist_id, cntrt.cntrct_id
-> Nested Loop (cost=0.00..177304.01 rows=1 width=125) (actual time=21612.109..270938.877 rows=120 loops=1)
-> Nested Loop (cost=0.00..177299.40 rows=1 width=106) (actual time=21538.207..267696.943 rows=120 loops=1)
-> Nested Loop (cost=0.00..177294.64 rows=1 width=106) (actual time=21271.967..263316.202 rows=120 loops=1)
Join Filter: (("inner".cntrct_id)::bpchar = "outer".cntrct_id)
-> Nested Loop (cost=0.00..19221.50 rows=1 width=48) (actual time=2057.840..25089.891 rows=65 loops=1)
-> Nested Loop (cost=0.00..19210.31 rows=2 width=43) (actual time=1884.550..17108.249 rows=68 loops=1)
-> Seq Scan on cntrt (cost=0.00..19199.68 rows=2 width=43) (actual time=1590.328..8572.132 rows=68 loops=1)
Filter: (((cntrtyp_cd)::text = '260'::text) AND (dow_flg1 = 'NO'::bpchar))
-> Index Scan using fk_cntrct on addru (cost=0.00..5.30 rows=1 width=8) (actual time=125.508..125.513 rows=1 loops=68)
Index Cond: ("outer".cntrct_seq = addru.cntrct_seq)
-> Index Scan using fk_aunit on acntrec (cost=0.00..5.59 rows=1 width=13) (actual time=117.329..117.340 rows=1 loops=68)
Index Cond: ("outer".aunit_seq = acntrec.aunit_seq)
Filter: ((cd_inst = 49) AND ((months)::text = '49'::text))
-> Seq Scan on ttrans (cost=0.00..157710.93 rows=28976 width=58) (actual time=39.742..3530.494 rows=86415 loops=65)
Filter: ((tran_dt >= '2007-01-01 00:00:00-06'::timestamp with time zone) AND (tran_dt < '2007-02-01 00:00:00-06'::timestamp with time zone))
-> Index Scan using "pkeyCUSTM" on custm (cost=0.00..4.75 rows=1 width=8) (actual time=36.492..36.494 rows=1 loops=120)
Index Cond: ("outer".clnt_seq = custm.clnt_seq)
-> Index Scan using "pkeyPERSN" on persn (cost=0.00..4.59 rows=1 width=27) (actual time=26.973..26.981 rows=1 loops=120)
Index Cond: ("outer".person_seq = persn.person_seq)
Total runtime: 271175.640 ms
(22 rows)

Does anybody have ideas about what might be causing the slowdown? The slow database is fed by Slony log shipping from the 'normal' one, and both are (at least theoretically) getting vacuumed every night. What else might be causing this kind of problem?
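
One rough way to compare the two boxes for bloat (a sketch; the table names are the ones from the query above, and relpages/reltuples are only refreshed by VACUUM or ANALYZE, so run that first):

-- Compare these numbers between the slow and the 'normal' server;
-- a much larger relpages count for roughly the same reltuples suggests bloat.
select relname, relpages, reltuples
  from pg_class
 where relname in ('ttrans', 'cntrt', 'acntrec', 'addru', 'custm', 'persn')
 order by relpages desc;

-- VERBOSE shows how many dead row versions each vacuum pass removed.
vacuum verbose analyze ttrans;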

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
Andrew Edson writes:
> I'm working on a php project that's supposed to draw information from the DB for display, and I've been requested to speed up the display as much as possible. I'm drawing data from four tables, with an additional two that I have to 'bounce' through to match the keys together. Also, I've got five direct filtering requirements, four of the 'value = X' type and a date range.

> My question is this: Would shuffling the placement of the filtering
> requirements (t1.some_key = t2.some_key and t1.some_other_value = X,
> etc.) make a difference in processing speed for the response time?

No; certainly it will make no difference how you shuffle clauses that
involve different sets of tables. If you've got clauses that wind up in
the same "Filter:" condition in the generated plan, and some of them
involve expensive functions, it might be useful to shuffle the
expensive-to-evaluate ones to the end. But in most cases that's just
micro-optimization. Usually what you want to think about for something
like this is plan optimization, ie, what order are the tables joined in
and with what join methods. Have you looked at EXPLAIN ANALYZE results
for the query?
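
A minimal sketch of the clause-ordering point, using a hypothetical expensive_check() function that is not part of the original query: when both clauses apply to ttrans they end up in the same Filter condition, so writing the cheap comparison first lets it reject rows before the costly function is evaluated.

explain analyze
select *
  from ttrans
 where tran_dt >= '2007-01-01'     -- cheap comparison, written first
   and expensive_check(tran_dt);   -- hypothetical costly function, written last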

regards, tom lane



