strange plan choice

From: Cousin Marc <mcousin(at)sigma(dot)fr>
To: pgsql-performance(at)postgresql(dot)org
Subject: strange plan choice
Date: 2008-02-19 15:27:58
Message-ID: 200802191627.58150.mcousin@sigma.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi,

This occurs on postgresql 8.2.5.

I'm a bit at loss with the plan chosen for a query :

The query is this one :

SELECT SULY_SAOEN.SAOEN_ID, SULY_SDCEN.SDCEN_REF, SULY_SDCEN.SDCEN_LIB, CSTD_UTI.UTI_NOM, CSTD_UTI.UTI_LIBC, SULY_SAOEN.SAOEN_DTDERNENVOI,
SULY_SDCEN.SDCEN_DTLIMAP, SULY_PFOUR.PFOUR_RAISON, SULY_SDCEN.PGTC_CODE
FROM SULY_SDCEN
inner join SULY_SDDEN on (SULY_SDCEN.SDCEN_ID=SULY_SDDEN.SDCEN_ID)
inner join SULY_SAOEN on (SULY_SAOEN.SDDEN_ID=SULY_SDDEN.SDDEN_ID)
inner join CSTD_UTI on (CSTD_UTI.UTI_CODE=SULY_SDDEN.SDDEN_RESPPROS)
inner join SULY_PFOUR on (SULY_PFOUR.PFOUR_ID=SULY_SAOEN.PFOUR_ID)
WHERE SULY_SDCEN.PGTC_CODE = '403' AND SULY_SDDEN.PBURE_ID IN (400001)
AND SULY_SAOEN.SAOEN_ID IN
(
SELECT TmpAoen.SAOEN_ID
FROM SULY_SAOPR TmpAopr
LEFT JOIN SULY_SOFPR TmpOfpr ON (TmpOfpr.SAOPR_ID = TmpAopr.SAOPR_ID),SULY_SAOEN TmpAoen
WHERE TmpAopr.SAOEN_ID= TmpAoen.SAOEN_ID AND (SOFPR_DEMCOMP = 1 OR (SAOPR_DTENV IS NOT NULL AND SAOPR_DTREPONSE IS NULL))
)

The plan I get is :

QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=65.91..2395.16 rows=6 width=142) (actual time=696.212..2566.996 rows=2 loops=1)
-> Nested Loop IN Join (cost=65.91..2391.95 rows=6 width=124) (actual time=696.189..2566.957 rows=2 loops=1)
Join Filter: (suly_saoen.saoen_id = tmpaopr.saoen_id)
-> Nested Loop (cost=10.84..34.21 rows=6 width=124) (actual time=0.233..0.617 rows=30 loops=1)
-> Nested Loop (cost=10.84..29.00 rows=2 width=108) (actual time=0.223..0.419 rows=2 loops=1)
-> Hash Join (cost=10.84..24.44 rows=2 width=87) (actual time=0.207..0.372 rows=2 loops=1)
Hash Cond: (suly_sdden.sdcen_id = suly_sdcen.sdcen_id)
-> Seq Scan on suly_sdden (cost=0.00..13.36 rows=58 width=27) (actual time=0.012..0.163 rows=58 loops=1)
Filter: (pbure_id = 400001)
-> Hash (cost=10.74..10.74 rows=8 width=76) (actual time=0.129..0.129 rows=8 loops=1)
-> Seq Scan on suly_sdcen (cost=0.00..10.74 rows=8 width=76) (actual time=0.017..0.113 rows=8 loops=1)
Filter: ((pgtc_code)::text = '403'::text)
-> Index Scan using pk_cstd_uti on cstd_uti (cost=0.00..2.27 rows=1 width=42) (actual time=0.015..0.017 rows=1 loops=2)
Index Cond: ((cstd_uti.uti_code)::text = (suly_sdden.sdden_resppros)::text)
-> Index Scan using ass_saoen_sdden_fk on suly_saoen (cost=0.00..2.54 rows=5 width=32) (actual time=0.007..0.049 rows=15 loops=2)
Index Cond: (suly_saoen.sdden_id = suly_sdden.sdden_id)
-> Hash Join (cost=55.07..2629.62 rows=8952 width=16) (actual time=0.119..82.680 rows=3202 loops=30)
Hash Cond: (tmpaopr.saoen_id = tmpaoen.saoen_id)
-> Merge Left Join (cost=0.00..2451.46 rows=8952 width=8) (actual time=0.027..76.229 rows=3202 loops=30)
Merge Cond: (tmpaopr.saopr_id = tmpofpr.saopr_id)
Filter: ((tmpofpr.sofpr_demcomp = 1::numeric) OR ((tmpaopr.saopr_dtenv IS NOT NULL) AND (tmpaopr.saopr_dtreponse IS NULL)))
-> Index Scan using pk_suly_saopr on suly_saopr tmpaopr (cost=0.00..1193.49 rows=15412 width=32) (actual time=0.012..19.431 rows=14401 loops=30)
-> Index Scan using ass_saopr_sofpr_fk on suly_sofpr tmpofpr (cost=0.00..998.90 rows=14718 width=16) (actual time=0.010..18.377 rows=13752 loops=30)
-> Hash (cost=38.92..38.92 rows=1292 width=8) (actual time=2.654..2.654 rows=1292 loops=1)
-> Seq Scan on suly_saoen tmpaoen (cost=0.00..38.92 rows=1292 width=8) (actual time=0.006..1.322 rows=1292 loops=1)
-> Index Scan using pk_suly_pfour on suly_pfour (cost=0.00..0.52 rows=1 width=34) (actual time=0.010..0.011 rows=1 loops=2)
Index Cond: (suly_pfour.pfour_id = suly_saoen.pfour_id)
Total runtime: 2567.225 ms
(28 lignes)

What I don't understand is the Nested Loop IN. If I understand correctly, the consequence is that the
bottom part (hash joins) is done 30 times ? Why not just once ?

If I remove SULY_SDCEN.PGTC_CODE = '403', the query becomes 25 times faster.

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=2766.40..2879.44 rows=175 width=142) (actual time=121.927..123.996 rows=120 loops=1)
-> Hash Join (cost=2766.40..2785.92 rows=175 width=124) (actual time=121.881..122.830 rows=120 loops=1)
Hash Cond: (tmpaopr.saoen_id = suly_saoen.saoen_id)
-> HashAggregate (cost=2652.00..2664.92 rows=1292 width=16) (actual time=114.968..115.306 rows=351 loops=1)
-> Hash Join (cost=55.07..2629.62 rows=8952 width=16) (actual time=2.694..111.293 rows=3424 loops=1)
Hash Cond: (tmpaopr.saoen_id = tmpaoen.saoen_id)
-> Merge Left Join (cost=0.00..2451.46 rows=8952 width=8) (actual time=0.038..101.836 rows=3424 loops=1)
Merge Cond: (tmpaopr.saopr_id = tmpofpr.saopr_id)
Filter: ((tmpofpr.sofpr_demcomp = 1::numeric) OR ((tmpaopr.saopr_dtenv IS NOT NULL) AND (tmpaopr.saopr_dtreponse IS NULL)))
-> Index Scan using pk_suly_saopr on suly_saopr tmpaopr (cost=0.00..1193.49 rows=15412 width=32) (actual time=0.016..30.360 rows=15412 loops=1)
-> Index Scan using ass_saopr_sofpr_fk on suly_sofpr tmpofpr (cost=0.00..998.90 rows=14718 width=16) (actual time=0.012..29.359 rows=14717 loops=1)
-> Hash (cost=38.92..38.92 rows=1292 width=8) (actual time=2.630..2.630 rows=1292 loops=1)
-> Seq Scan on suly_saoen tmpaoen (cost=0.00..38.92 rows=1292 width=8) (actual time=0.005..1.290 rows=1292 loops=1)
-> Hash (cost=112.21..112.21 rows=175 width=124) (actual time=6.892..6.892 rows=287 loops=1)
-> Hash Join (cost=66.70..112.21 rows=175 width=124) (actual time=3.557..6.413 rows=287 loops=1)
Hash Cond: (suly_saoen.sdden_id = suly_sdden.sdden_id)
-> Seq Scan on suly_saoen (cost=0.00..38.92 rows=1292 width=32) (actual time=0.010..1.272 rows=1292 loops=1)
-> Hash (cost=65.97..65.97 rows=58 width=108) (actual time=3.386..3.386 rows=58 loops=1)
-> Hash Join (cost=51.02..65.97 rows=58 width=108) (actual time=2.816..3.300 rows=58 loops=1)
Hash Cond: (suly_sdden.sdcen_id = suly_sdcen.sdcen_id)
-> Hash Join (cost=38.09..52.25 rows=58 width=48) (actual time=2.132..2.488 rows=58 loops=1)
Hash Cond: ((suly_sdden.sdden_resppros)::text = (cstd_uti.uti_code)::text)
-> Seq Scan on suly_sdden (cost=0.00..13.36 rows=58 width=27) (actual time=0.021..0.203 rows=58 loops=1)
Filter: (pbure_id = 400001)
-> Hash (cost=28.04..28.04 rows=804 width=42) (actual time=2.092..2.092 rows=804 loops=1)
-> Seq Scan on cstd_uti (cost=0.00..28.04 rows=804 width=42) (actual time=0.012..1.075 rows=804 loops=1)
-> Hash (cost=10.19..10.19 rows=219 width=76) (actual time=0.670..0.670 rows=219 loops=1)
-> Seq Scan on suly_sdcen (cost=0.00..10.19 rows=219 width=76) (actual time=0.027..0.370 rows=219 loops=1)
-> Index Scan using pk_suly_pfour on suly_pfour (cost=0.00..0.52 rows=1 width=34) (actual time=0.005..0.006 rows=1 loops=120)
Index Cond: (suly_pfour.pfour_id = suly_saoen.pfour_id)
Total runtime: 124.398 ms

I see that there is an estimation error on
"Nested Loop (cost=10.84..34.21 rows=6 width=124) (actual time=0.233..0.617 rows=30 loops=1)"
and that the costs of both queries is very close ...

But I don't see a good solution. Does anybody have advice on this one ?

Thanks a lot for your help.

Browse pgsql-performance by date

  From Date Subject
Next Message Andrew Sullivan 2008-02-19 15:48:01 Re: Disable WAL completely
Previous Message Matthew 2008-02-19 14:48:55 Re: Disable WAL completely