Question about LEFT JOIN and query plan

From: Kaloyan Iliev Iliev <kaloyan(at)digsys(dot)bg>
To: pgsql-performance(at)postgresql(dot)org
Subject: Question about LEFT JOIN and query plan
Date: 2010-09-03 16:16:40
Message-ID: 4C811F68.9060601@digsys.bg
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hello,
I have I query which behave strangely (according to me).
According to the first plan PG makes absolutely unnecessary seq scan on
tables "invoices" and "domeini" and etc.
I thing they should be access only if there are rows from the where. Why
the left join executes first?
Then I rewrite the query and move left joins to sub queries and the
result was great speed up.
But I thing it is more correctly to write the query with left joins. At
least the sub queries have similar parts which are now accessed twice.

So I will appreciate any suggestions how it is correct to write this
query and why the left join plan is so wrong.

SELECT version();

version
----------------------------------------------------------------------------------------------------------
PostgreSQL 8.4.4 on amd64-portbld-freebsd8.1, compiled by GCC cc (GCC)
4.2.1 20070719 [FreeBSD], 64-bit
(1 row)

Best regards,
Kaloyan Iliev

===============================ORIGINAL QUERY==============================
explain analyze SELECT
DD.debtid,
ADD.amount as saldo,
DOM.fqdn ||DT.descr as domain_fqdn,
S.descr_bg as service_descr_bg,
ADD.pno,
ADD.amount,
M.name_bg as measure_name_bg,
AC.ino,
I.idate
FROM debts_desc DD LEFT JOIN domeini DOM ON
(DD.domain_id = DOM.id)
LEFT
JOIN domain_type DT ON (DOM.domain_type_id = DT.id)
LEFT JOIN acc_debts
ADC ON (DD.debtid = ADC.debtid AND ADC.credit)
LEFT JOIN
acc_clients AC ON (AC.transact_no = ADC.transact_no AND NOT AC.credit)
LEFT JOIN
invoices I ON (AC.ino = I.ino AND I.istatus = 0),
acc_debts ADD,
services S,
measures M,
proforms P
WHERE DD.debtid = ADD.debtid
AND DD.measure_id = M.measure_id
AND DD.active
AND NOT DD.paid
AND DD.has_proform
AND NOT DD.storned
AND ADD.pno =
P.pno
AND NOT ADD.credit

AND
P.person1_id = 287294
AND
DD.serviceid = S.serviceid;


QUERY
PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=37503.47..47243.77 rows=1 width=110) (actual
time=1522.796..1522.796 rows=0 loops=1)
Join Filter: (dd.measure_id = m.measure_id)
-> Nested Loop (cost=37503.47..47242.45 rows=1 width=106) (actual
time=1522.794..1522.794 rows=0 loops=1)
Join Filter: (dd.serviceid = s.serviceid)
-> Hash Join (cost=37503.47..47239.46 rows=1 width=79)
(actual time=1522.791..1522.791 rows=0 loops=1)
Hash Cond: (dd.debtid = add.debtid)
-> Hash Left Join (cost=37475.95..47122.76 rows=23782
width=67) (actual time=1370.668..1521.629 rows=1037 loops=1)
Hash Cond: (dom.domain_type_id = dt.id)
-> Hash Left Join (cost=37474.12..46793.92
rows=23782 width=66) (actual time=1370.563..1519.302 rows=1037 loops=1)
Hash Cond: (dd.domain_id = dom.id)
-> Hash Left Join (cost=23487.71..30402.02
rows=23782 width=54) (actual time=556.587..636.320 rows=1037 loops=1)
Hash Cond: (ac.ino = i.ino)
-> Hash Left Join
(cost=8410.66..14259.11 rows=23782 width=50) (actual
time=318.180..387.026 rows=1037 loops=1)
Hash Cond: (adc.transact_no =
ac.transact_no)
-> Hash Left Join
(cost=4973.98..9903.69 rows=23782 width=50) (actual
time=175.979..234.068 rows=1037 loops=1)
Hash Cond: (dd.debtid =
adc.debtid)
-> Seq Scan on debts_desc
dd (cost=0.00..2866.52 rows=23782 width=46) (actual time=0.481..45.085
rows=1037 loops=1)
Filter: (active AND
(NOT paid) AND has_proform AND (NOT storned))
-> Hash
(cost=3942.08..3942.08 rows=62872 width=8) (actual time=175.410..175.410
rows=63157 loops=1)
-> Seq Scan on
acc_debts adc (cost=0.00..3942.08 rows=62872 width=8) (actual
time=0.097..102.172 rows=63157 loops=1)
Filter: credit
-> Hash (cost=2536.53..2536.53
rows=54812 width=8) (actual time=142.169..142.169 rows=54559 loops=1)
-> Seq Scan on acc_clients
ac (cost=0.00..2536.53 rows=54812 width=8) (actual time=0.019..78.736
rows=54559 loops=1)
Filter: (NOT credit)
-> Hash (cost=14181.02..14181.02
rows=54562 width=8) (actual time=238.380..238.380 rows=54559 loops=1)
-> Seq Scan on invoices i
(cost=0.00..14181.02 rows=54562 width=8) (actual time=0.029..170.761
rows=54559 loops=1)
Filter: (istatus = 0)
-> Hash (cost=8669.96..8669.96 rows=305796
width=16) (actual time=813.940..813.940 rows=305796 loops=1)
-> Seq Scan on domeini dom
(cost=0.00..8669.96 rows=305796 width=16) (actual time=0.015..419.684
rows=305796 loops=1)
-> Hash (cost=1.37..1.37 rows=37 width=9) (actual
time=0.087..0.087 rows=37 loops=1)
-> Seq Scan on domain_type dt
(cost=0.00..1.37 rows=37 width=9) (actual time=0.003..0.040 rows=37 loops=1)
-> Hash (cost=27.45..27.45 rows=5 width=16) (actual
time=0.078..0.078 rows=1 loops=1)
-> Nested Loop (cost=0.00..27.45 rows=5 width=16)
(actual time=0.067..0.073 rows=1 loops=1)
-> Index Scan using proforms_person1_id_idx
on proforms p (cost=0.00..10.62 rows=2 width=4) (actual
time=0.045..0.046 rows=1 loops=1)
Index Cond: (person1_id = 287294)
-> Index Scan using acc_debts_pno_idx on
acc_debts add (cost=0.00..8.38 rows=3 width=16) (actual
time=0.017..0.019 rows=1 loops=1)
Index Cond: (add.pno = p.pno)
Filter: (NOT add.credit)
-> Seq Scan on services s (cost=0.00..2.44 rows=44 width=31)
(never executed)
-> Seq Scan on measures m (cost=0.00..1.14 rows=14 width=8) (never
executed)
Total runtime: 1523.525 ms
(41 rows)

==================================================AFTER
REWRITE============================================

explain analyze SELECT
DD.debtid,
ADD.amount as saldo,
(SELECT DOM.fqdn ||DT.descr
FROM domeini DOM, domain_type DT
WHERE DOM.domain_type_id = DT.id
AND DD.domain_id = DOM.id) as
domain_fqdn,
S.descr_bg as service_descr_bg,
ADD.pno,
ADD.amount,
M.name_bg as measure_name_bg,
(SELECT AC.ino FROM acc_debts ACD,
acc_clients AC
WHERE ACD.debtid = ADD.debtid
AND ACD.credit
AND AC.transact_no =
ACD.transact_no
AND NOT AC.credit) as ino,
(SELECT I.idate FROM acc_debts ACD,
acc_clients AC,
invoices I
WHERE ACD.debtid = ADD.debtid
AND ACD.credit
AND AC.transact_no =
ACD.transact_no
AND NOT AC.credit
AND AC.ino = I.ino
AND I.istatus = 0) as idate
FROM debts_desc DD,
acc_debts ADD,
services S,
measures M,
proforms P
WHERE DD.debtid = ADD.debtid
AND DD.measure_id = M.measure_id
AND DD.active
AND NOT DD.paid
AND DD.has_proform
AND NOT DD.storned
AND ADD.pno = P.pno
AND NOT ADD.credit
AND P.person1_id = 287294
AND DD.serviceid = S.serviceid;


QUERY
PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..77.90 rows=1 width=93) (actual
time=0.047..0.047 rows=0 loops=1)
-> Nested Loop (cost=0.00..32.96 rows=1 width=66) (actual
time=0.045..0.045 rows=0 loops=1)
-> Nested Loop (cost=0.00..32.68 rows=1 width=62) (actual
time=0.043..0.043 rows=0 loops=1)
-> Nested Loop (cost=0.00..27.45 rows=5 width=16)
(actual time=0.026..0.031 rows=1 loops=1)
-> Index Scan using proforms_person1_id_idx on
proforms p (cost=0.00..10.62 rows=2 width=4) (actual time=0.013..0.014
rows=1 loops=1)
Index Cond: (person1_id = 287294)
-> Index Scan using acc_debts_pno_idx on acc_debts
add (cost=0.00..8.38 rows=3 width=16) (actual time=0.007..0.008 rows=1
loops=1)
Index Cond: (add.pno = p.pno)
Filter: (NOT add.credit)
-> Index Scan using debts_desc_pkey on debts_desc dd
(cost=0.00..1.03 rows=1 width=46) (actual time=0.007..0.007 rows=0 loops=1)
Index Cond: (dd.debtid = add.debtid)
Filter: (dd.active AND (NOT dd.paid) AND
dd.has_proform AND (NOT dd.storned))
-> Index Scan using measures_pkey on measures m
(cost=0.00..0.27 rows=1 width=8) (never executed)
Index Cond: (m.measure_id = dd.measure_id)
-> Index Scan using services_pkey on services s (cost=0.00..0.27
rows=1 width=31) (never executed)
Index Cond: (s.serviceid = dd.serviceid)
SubPlan 1
-> Hash Join (cost=8.31..9.84 rows=1 width=13) (never executed)
Hash Cond: (dt.id = dom.domain_type_id)
-> Seq Scan on domain_type dt (cost=0.00..1.37 rows=37
width=9) (never executed)
-> Hash (cost=8.30..8.30 rows=1 width=12) (never executed)
-> Index Scan using domeini_pkey on domeini dom
(cost=0.00..8.30 rows=1 width=12) (never executed)
Index Cond: ($0 = id)
SubPlan 2
-> Nested Loop (cost=0.00..16.63 rows=1 width=4) (never executed)
-> Index Scan using acc_debts_debtid_idx on acc_debts acd
(cost=0.00..8.33 rows=1 width=4) (never executed)
Index Cond: (debtid = $1)
Filter: credit
-> Index Scan using acc_clients_transact_no_uidx on
acc_clients ac (cost=0.00..8.28 rows=1 width=8) (never executed)
Index Cond: (ac.transact_no = acd.transact_no)
Filter: (NOT ac.credit)
SubPlan 3
-> Nested Loop (cost=0.00..18.19 rows=1 width=4) (never executed)
-> Nested Loop (cost=0.00..16.63 rows=1 width=4) (never
executed)
-> Index Scan using acc_debts_debtid_idx on acc_debts
acd (cost=0.00..8.33 rows=1 width=4) (never executed)
Index Cond: (debtid = $1)
Filter: credit
-> Index Scan using acc_clients_transact_no_uidx on
acc_clients ac (cost=0.00..8.28 rows=1 width=8) (never executed)
Index Cond: (ac.transact_no = acd.transact_no)
Filter: (NOT ac.credit)
-> Index Scan using invoices_ino_uidx on invoices i
(cost=0.00..1.55 rows=1 width=8) (never executed)
Index Cond: (i.ino = ac.ino)
Total runtime: 0.202 ms
(43 rows)

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Gerhard Wiesinger 2010-09-03 16:38:53 Re: Major performance problem after upgrade from 8.3 to 8.4
Previous Message Florian Weimer 2010-09-03 15:28:19 Re: Odd estimation issue with user-defined type