Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500

From: Jian Guo <gjian(at)vmware(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Hans Buschmann <buschmann(at)nidsa(dot)net>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: Zhenghua Lyu <zlyu(at)vmware(dot)com>
Subject: Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500
Date: 2023-08-21 08:16:12
Message-ID: SN6PR05MB5199C0095508555B7D8D0D23C41EA@SN6PR05MB5199.namprd05.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi hackers,

I found a new approach to fix this issue, which seems better, so I would like to post another version of the patch here. The origin patch made the assumption of the values of Vars from CTE must be unique, which could be very wrong. This patch examines variables for Vars inside CTE, which avoided the bad assumption, so the results could be much more accurate.

Regards,
Jian

________________________________
From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Sent: Monday, August 14, 2023 20:58
To: Jian Guo <gjian(at)vmware(dot)com>; Hans Buschmann <buschmann(at)nidsa(dot)net>; pgsql-hackers(at)lists(dot)postgresql(dot)org <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500

!! External Email

Hi,

I haven't looked at the patch, but please add the patch to the next
commit fest (2023-09), so that we don't lose track of it.

See https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcommitfest.postgresql.org%2F&data=05%7C01%7Cgjian%40vmware.com%7C9d40e84af2c946f3517a08db9cc61ee2%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C0%7C0%7C638276146959658928%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=EUlMgo%2BU4Oi%2BWf0cS%2FKnTmhHzZrYzu26PzfxYnZIDFs%3D&reserved=0<https://commitfest.postgresql.org/>

regards

Tomas

On 8/14/23 13:12, Jian Guo wrote:
> Hi hackers,
>
> I have written a patch to add stats info for Vars in CTEs. With this
> patch, the join size estimation on the upper of CTE scans became more
> accurate.
>
> In the function |selfuncs.c:eqjoinsel| it uses the number of the
> distinct values of the two join variables to estimate join size, and in
> the function |selfuncs.c:get_variable_numdistinct| return a default
> value |DEFAULT_NUM_DISTINCT| (200 in Postgres and 1000 in Greenplum),
> with the default value, you can never expect a good plan.
>
> Thanks if anyone could give a review.
>
> Regards,
> Jian
>
> ------------------------------------------------------------------------
> *From:* Hans Buschmann <buschmann(at)nidsa(dot)net>
> *Sent:* Wednesday, February 8, 2023 21:55
> *To:* pgsql-hackers(at)lists(dot)postgresql(dot)org
> <pgsql-hackers(at)lists(dot)postgresql(dot)org>
> *Subject:* Wrong rows estimations with joins of CTEs slows queries by
> more than factor 500
>
>
> !! External Email
>
> During data refactoring of our Application I encountered $subject when
> joining 4 CTEs with left join or inner join.
>
>
> 1. Background
>
> PG 15.1 on Windows x64 (OS seems no to have no meening here)
>
>
> I try to collect data from 4 (analyzed) tables (up,li,in,ou) by grouping
> certain data (4 CTEs qup,qli,qin,qou)
>
> The grouping of the data in the CTEs gives estimated row counts of about
> 1000 (1 tenth of the real value) This is OK for estimation.
>
>
> These 4 CTEs are then used to combine the data by joining them.
>
>
> 2. Problem
>
> The 4 CTEs are joined by left joins as shown below:
>
>
> from qup
> left join qli on (qli.curr_season=qup.curr_season and
> qli.curr_code=qup.curr_code and qli.ibitmask>0 and
> cardinality(qli.mat_arr) <=8)
> left join qin on (qin.curr_season=qup.curr_season and
> qin.curr_code=qup.curr_code and qin.ibitmask>0 and
> cardinality(qin.mat_arr) <=8)
> left join qou on (qou.curr_season=qup.curr_season and
> qou.curr_code=qup.curr_code and qou.ibitmask>0 and
> cardinality(qou.mat_arr) <=11)
> where qup.ibitmask>0 and cardinality(qup.mat_arr) <=21
>
> The plan first retrieves qup and qli, taking the estimated row counts of
> 1163 and 1147 respectively
>
>
> BUT the result is then hashed and the row count is estimated as 33!
>
>
> In a Left join the row count stays always the same as the one of left
> table (here qup with 1163 rows)
>
>
> The same algorithm which reduces the row estimation from 1163 to 33 is
> used in the next step to give an estimation of 1 row.
>
> This is totally wrong.
>
>
> Here is the execution plan of the query:
>
> (search the plan for rows=33)
>
>
>
> QUERY PLAN
> --------------------------------------------------------------------------------------------------------------------------------------------------
> Append (cost=13673.81..17463.30 rows=5734 width=104) (actual
> time=168.307..222.670 rows=9963 loops=1)
> CTE qup
> -> GroupAggregate (cost=5231.22..6303.78 rows=10320 width=80)
> (actual time=35.466..68.131 rows=10735 loops=1)
> Group Key: sa_upper.sup_season, sa_upper.sup_sa_code
> -> Sort (cost=5231.22..5358.64 rows=50969 width=18) (actual
> time=35.454..36.819 rows=50969 loops=1)
> Sort Key: sa_upper.sup_season, sa_upper.sup_sa_code
> COLLATE "C"
> Sort Method: quicksort Memory: 4722kB
> -> Hash Left Join (cost=41.71..1246.13 rows=50969
> width=18) (actual time=0.148..10.687 rows=50969 loops=1)
> Hash Cond: ((sa_upper.sup_mat_code)::text =
> upper_target.up_mat_code)
> -> Seq Scan on sa_upper (cost=0.00..884.69
> rows=50969 width=16) (actual time=0.005..1.972 rows=50969 loops=1)
> -> Hash (cost=35.53..35.53 rows=495 width=6)
> (actual time=0.140..0.140 rows=495 loops=1)
> Buckets: 1024 Batches: 1 Memory Usage: 27kB
> -> Seq Scan on upper_target
> (cost=0.00..35.53 rows=495 width=6) (actual time=0.007..0.103 rows=495
> loops=1)
> Filter: (id_up <= 495)
> Rows Removed by Filter: 1467
> CTE qli
> -> GroupAggregate (cost=1097.31..1486.56 rows=10469 width=80)
> (actual time=9.446..27.388 rows=10469 loops=1)
> Group Key: sa_lining.sli_season, sa_lining.sli_sa_code
> -> Sort (cost=1097.31..1126.74 rows=11774 width=18) (actual
> time=9.440..9.811 rows=11774 loops=1)
> Sort Key: sa_lining.sli_season, sa_lining.sli_sa_code
> COLLATE "C"
> Sort Method: quicksort Memory: 1120kB
> -> Hash Left Join (cost=7.34..301.19 rows=11774
> width=18) (actual time=0.045..2.438 rows=11774 loops=1)
> Hash Cond: ((sa_lining.sli_mat_code)::text =
> lining_target.li_mat_code)
> -> Seq Scan on sa_lining (cost=0.00..204.74
> rows=11774 width=16) (actual time=0.008..0.470 rows=11774 loops=1)
> -> Hash (cost=5.86..5.86 rows=118 width=6)
> (actual time=0.034..0.034 rows=119 loops=1)
> Buckets: 1024 Batches: 1 Memory Usage: 13kB
> -> Seq Scan on lining_target
> (cost=0.00..5.86 rows=118 width=6) (actual time=0.008..0.024 rows=119
> loops=1)
> Filter: (id_li <= 119)
> Rows Removed by Filter: 190
> CTE qin
> -> GroupAggregate (cost=1427.34..1880.73 rows=10678 width=80)
> (actual time=11.424..31.508 rows=10678 loops=1)
> Group Key: sa_insole.sin_season, sa_insole.sin_sa_code
> -> Sort (cost=1427.34..1465.41 rows=15230 width=18) (actual
> time=11.416..11.908 rows=15230 loops=1)
> Sort Key: sa_insole.sin_season, sa_insole.sin_sa_code
> COLLATE "C"
> Sort Method: quicksort Memory: 1336kB
> -> Hash Left Join (cost=10.49..369.26 rows=15230
> width=18) (actual time=0.051..3.108 rows=15230 loops=1)
> Hash Cond: ((sa_insole.sin_mat_code)::text =
> insole_target.in_mat_code)
> -> Seq Scan on sa_insole (cost=0.00..264.30
> rows=15230 width=16) (actual time=0.006..0.606 rows=15230 loops=1)
> -> Hash (cost=9.01..9.01 rows=118 width=6)
> (actual time=0.042..0.043 rows=119 loops=1)
> Buckets: 1024 Batches: 1 Memory Usage: 13kB
> -> Seq Scan on insole_target
> (cost=0.00..9.01 rows=118 width=6) (actual time=0.008..0.032 rows=119
> loops=1)
> Filter: (id_in <= 119)
> Rows Removed by Filter: 362
> CTE qou
> -> GroupAggregate (cost=2366.22..2986.89 rows=10699 width=80)
> (actual time=18.198..41.812 rows=10699 loops=1)
> Group Key: sa_outsole.sou_season, sa_outsole.sou_sa_code
> -> Sort (cost=2366.22..2428.14 rows=24768 width=18) (actual
> time=18.187..18.967 rows=24768 loops=1)
> Sort Key: sa_outsole.sou_season, sa_outsole.sou_sa_code
> COLLATE "C"
> Sort Method: quicksort Memory: 2317kB
> -> Hash Left Join (cost=5.39..558.63 rows=24768
> width=18) (actual time=0.046..5.132 rows=24768 loops=1)
> Hash Cond: ((sa_outsole.sou_mat_code)::text =
> outsole_target.ou_mat_code)
> -> Seq Scan on sa_outsole (cost=0.00..430.68
> rows=24768 width=16) (actual time=0.010..1.015 rows=24768 loops=1)
> -> Hash (cost=5.03..5.03 rows=29 width=6)
> (actual time=0.032..0.032 rows=29 loops=1)
> Buckets: 1024 Batches: 1 Memory Usage: 10kB
> -> Seq Scan on outsole_target
> (cost=0.00..5.03 rows=29 width=6) (actual time=0.010..0.025 rows=29 loops=1)
> Filter: (id_ou <= 29)
> Rows Removed by Filter: 213
> -> Hash Join (cost=1015.85..1319.50 rows=1 width=104) (actual
> time=168.307..215.513 rows=8548 loops=1)
> Hash Cond: ((qou.curr_season = qli.curr_season) AND
> ((qou.curr_code)::text = (qli.curr_code)::text))
> Join Filter: ((((qup.ibitmask | qin.ibitmask) | qli.ibitmask) |
> qou.ibitmask) IS NOT NULL)
> -> CTE Scan on qou (cost=0.00..294.22 rows=1189 width=76)
> (actual time=18.200..45.188 rows=10275 loops=1)
> Filter: ((ibitmask > 0) AND (cardinality(mat_arr) <= 11))
> Rows Removed by Filter: 424
> -> Hash (cost=1015.83..1015.83 rows=1 width=228) (actual
> time=150.094..150.095 rows=8845 loops=1)
> Buckets: 16384 (originally 1024) Batches: 1 (originally
> 1) Memory Usage: 1899kB
> -> Hash Join (cost=707.35..1015.83 rows=1 width=228)
> (actual time=121.898..147.726 rows=8845 loops=1)
> Hash Cond: ((qin.curr_season = qli.curr_season) AND
> ((qin.curr_code)::text = (qli.curr_code)::text))
> -> CTE Scan on qin (cost=0.00..293.65 rows=1186
> width=76) (actual time=11.425..34.674 rows=10197 loops=1)
> Filter: ((ibitmask > 0) AND
> (cardinality(mat_arr) <= 8))
> Rows Removed by Filter: 481
> -> Hash (cost=706.86..706.86 rows=33 width=152)
> (actual time=110.470..110.470 rows=9007 loops=1)
> Buckets: 16384 (originally 1024) Batches: 1
> (originally 1) Memory Usage: 1473kB
> -> Merge Join (cost=689.20..706.86 rows=33
> width=152) (actual time=105.862..108.925 rows=9007 loops=1)
> Merge Cond: ((qup.curr_season =
> qli.curr_season) AND ((qup.curr_code)::text = (qli.curr_code)::text))
> -> Sort (cost=342.09..344.96
> rows=1147 width=76) (actual time=73.419..73.653 rows=9320 loops=1)
> Sort Key: qup.curr_season,
> qup.curr_code COLLATE "C"
> Sort Method: quicksort Memory:
> 1391kB
> -> CTE Scan on qup
> (cost=0.00..283.80 rows=1147 width=76) (actual time=35.467..71.904
> rows=9320 loops=1)
> Filter: ((ibitmask > 0) AND
> (cardinality(mat_arr) <= 21))
> Rows Removed by Filter: 1415
> -> Sort (cost=347.12..350.02
> rows=1163 width=76) (actual time=32.440..32.697 rows=10289 loops=1)
> Sort Key: qli.curr_season,
> qli.curr_code COLLATE "C"
> Sort Method: quicksort Memory:
> 1349kB
> -> CTE Scan on qli
> (cost=0.00..287.90 rows=1163 width=76) (actual time=9.447..30.666
> rows=10289 loops=1)
> Filter: ((ibitmask > 0) AND
> (cardinality(mat_arr) <= 8))
> Rows Removed by Filter: 180
> -> Merge Left Join (cost=2625.49..3399.84 rows=5733 width=104)
> (actual time=4.597..6.700 rows=1415 loops=1)
> Merge Cond: ((qup_1.curr_season = qou_1.curr_season) AND
> ((qup_1.curr_code)::text = (qou_1.curr_code)::text))
> -> Merge Left Join (cost=1958.66..2135.28 rows=5733
> width=136) (actual time=3.427..3.863 rows=1415 loops=1)
> Merge Cond: ((qup_1.curr_season = qin_1.curr_season) AND
> ((qup_1.curr_code)::text = (qin_1.curr_code)::text))
> -> Merge Left Join (cost=1293.25..1388.21 rows=5733
> width=104) (actual time=2.321..2.556 rows=1415 loops=1)
> Merge Cond: ((qup_1.curr_season =
> qli_1.curr_season) AND ((qup_1.curr_code)::text = (qli_1.curr_code)::text))
> -> Sort (cost=641.68..656.02 rows=5733 width=72)
> (actual time=1.286..1.324 rows=1415 loops=1)
> Sort Key: qup_1.curr_season, qup_1.curr_code
> COLLATE "C"
> Sort Method: quicksort Memory: 204kB
> -> CTE Scan on qup qup_1 (cost=0.00..283.80
> rows=5733 width=72) (actual time=0.009..1.093 rows=1415 loops=1)
> Filter: ((ibitmask < 0) OR
> (cardinality(mat_arr) > 21))
> Rows Removed by Filter: 9320
> -> Sort (cost=651.57..666.11 rows=5816 width=72)
> (actual time=1.033..1.038 rows=180 loops=1)
> Sort Key: qli_1.curr_season, qli_1.curr_code
> COLLATE "C"
> Sort Method: quicksort Memory: 41kB
> -> CTE Scan on qli qli_1 (cost=0.00..287.90
> rows=5816 width=72) (actual time=0.055..1.007 rows=180 loops=1)
> Filter: ((ibitmask < 0) OR
> (cardinality(mat_arr) > 8))
> Rows Removed by Filter: 10289
> -> Sort (cost=665.41..680.24 rows=5932 width=72)
> (actual time=1.104..1.117 rows=481 loops=1)
> Sort Key: qin_1.curr_season, qin_1.curr_code
> COLLATE "C"
> Sort Method: quicksort Memory: 68kB
> -> CTE Scan on qin qin_1 (cost=0.00..293.65
> rows=5932 width=72) (actual time=0.016..1.038 rows=481 loops=1)
> Filter: ((ibitmask < 0) OR
> (cardinality(mat_arr) > 8))
> Rows Removed by Filter: 10197
> -> Sort (cost=666.83..681.69 rows=5944 width=72) (actual
> time=1.163..1.174 rows=417 loops=1)
> Sort Key: qou_1.curr_season, qou_1.curr_code COLLATE "C"
> Sort Method: quicksort Memory: 68kB
> -> CTE Scan on qou qou_1 (cost=0.00..294.22 rows=5944
> width=72) (actual time=0.029..1.068 rows=424 loops=1)
> Filter: ((ibitmask < 0) OR (cardinality(mat_arr) > 11))
> Rows Removed by Filter: 10275
> Planning Time: 2.297 ms
> Execution Time: 224.759 ms
> (118 Zeilen)
>
> 3. Slow query from wrong plan as result on similar case with inner join
>
> When the 3 left joins above are changed to inner joins like:
>
> from qup
> join qli on (qli.curr_season=qup.curr_season and
> qli.curr_code=qup.curr_code and qli.ibitmask>0 and
> cardinality(qli.mat_arr) <=8)
> join qin on (qin.curr_season=qup.curr_season and
> qin.curr_code=qup.curr_code and qin.ibitmask>0 and
> cardinality(qin.mat_arr) <=8)
> join qou on (qou.curr_season=qup.curr_season and
> qou.curr_code=qup.curr_code and qou.ibitmask>0 and
> cardinality(qou.mat_arr) <=11)
> where qup.ibitmask>0 and cardinality(qup.mat_arr) <=21
>
> The same rows estimation takes place as with the left joins, but the
> planner now decides to use a nested loop for the last join, which
> results in a 500fold execution time:
>
> QUERY PLAN
> --------------------------------------------------------------------------------------------------------------------------------------------
> Append (cost=13365.31..17472.18 rows=5734 width=104) (actual
> time=139.037..13403.310 rows=9963 loops=1)
> CTE qup
> -> GroupAggregate (cost=5231.22..6303.78 rows=10320 width=80)
> (actual time=35.399..67.102 rows=10735 loops=1)
> Group Key: sa_upper.sup_season, sa_upper.sup_sa_code
> -> Sort (cost=5231.22..5358.64 rows=50969 width=18) (actual
> time=35.382..36.743 rows=50969 loops=1)
> Sort Key: sa_upper.sup_season, sa_upper.sup_sa_code
> COLLATE "C"
> Sort Method: quicksort Memory: 4722kB
> -> Hash Left Join (cost=41.71..1246.13 rows=50969
> width=18) (actual time=0.157..10.715 rows=50969 loops=1)
> Hash Cond: ((sa_upper.sup_mat_code)::text =
> upper_target.up_mat_code)
> -> Seq Scan on sa_upper (cost=0.00..884.69
> rows=50969 width=16) (actual time=0.008..2.001 rows=50969 loops=1)
> -> Hash (cost=35.53..35.53 rows=495 width=6)
> (actual time=0.146..0.146 rows=495 loops=1)
> Buckets: 1024 Batches: 1 Memory Usage: 27kB
> -> Seq Scan on upper_target
> (cost=0.00..35.53 rows=495 width=6) (actual time=0.006..0.105 rows=495
> loops=1)
> Filter: (id_up <= 495)
> Rows Removed by Filter: 1467
> CTE qli
> -> GroupAggregate (cost=1097.31..1486.56 rows=10469 width=80)
> (actual time=9.541..27.419 rows=10469 loops=1)
> Group Key: sa_lining.sli_season, sa_lining.sli_sa_code
> -> Sort (cost=1097.31..1126.74 rows=11774 width=18) (actual
> time=9.534..9.908 rows=11774 loops=1)
> Sort Key: sa_lining.sli_season, sa_lining.sli_sa_code
> COLLATE "C"
> Sort Method: quicksort Memory: 1120kB
> -> Hash Left Join (cost=7.34..301.19 rows=11774
> width=18) (actual time=0.049..2.451 rows=11774 loops=1)
> Hash Cond: ((sa_lining.sli_mat_code)::text =
> lining_target.li_mat_code)
> -> Seq Scan on sa_lining (cost=0.00..204.74
> rows=11774 width=16) (actual time=0.010..0.462 rows=11774 loops=1)
> -> Hash (cost=5.86..5.86 rows=118 width=6)
> (actual time=0.035..0.035 rows=119 loops=1)
> Buckets: 1024 Batches: 1 Memory Usage: 13kB
> -> Seq Scan on lining_target
> (cost=0.00..5.86 rows=118 width=6) (actual time=0.008..0.025 rows=119
> loops=1)
> Filter: (id_li <= 119)
> Rows Removed by Filter: 190
> CTE qin
> -> GroupAggregate (cost=1427.34..1880.73 rows=10678 width=80)
> (actual time=11.649..30.910 rows=10678 loops=1)
> Group Key: sa_insole.sin_season, sa_insole.sin_sa_code
> -> Sort (cost=1427.34..1465.41 rows=15230 width=18) (actual
> time=11.642..12.115 rows=15230 loops=1)
> Sort Key: sa_insole.sin_season, sa_insole.sin_sa_code
> COLLATE "C"
> Sort Method: quicksort Memory: 1336kB
> -> Hash Left Join (cost=10.49..369.26 rows=15230
> width=18) (actual time=0.056..3.144 rows=15230 loops=1)
> Hash Cond: ((sa_insole.sin_mat_code)::text =
> insole_target.in_mat_code)
> -> Seq Scan on sa_insole (cost=0.00..264.30
> rows=15230 width=16) (actual time=0.008..0.594 rows=15230 loops=1)
> -> Hash (cost=9.01..9.01 rows=118 width=6)
> (actual time=0.045..0.046 rows=119 loops=1)
> Buckets: 1024 Batches: 1 Memory Usage: 13kB
> -> Seq Scan on insole_target
> (cost=0.00..9.01 rows=118 width=6) (actual time=0.008..0.034 rows=119
> loops=1)
> Filter: (id_in <= 119)
> Rows Removed by Filter: 362
> CTE qou
> -> GroupAggregate (cost=2366.22..2986.89 rows=10699 width=80)
> (actual time=18.163..51.151 rows=10699 loops=1)
> Group Key: sa_outsole.sou_season, sa_outsole.sou_sa_code
> -> Sort (cost=2366.22..2428.14 rows=24768 width=18) (actual
> time=18.150..20.000 rows=24768 loops=1)
> Sort Key: sa_outsole.sou_season, sa_outsole.sou_sa_code
> COLLATE "C"
> Sort Method: quicksort Memory: 2317kB
> -> Hash Left Join (cost=5.39..558.63 rows=24768
> width=18) (actual time=0.036..5.106 rows=24768 loops=1)
> Hash Cond: ((sa_outsole.sou_mat_code)::text =
> outsole_target.ou_mat_code)
> -> Seq Scan on sa_outsole (cost=0.00..430.68
> rows=24768 width=16) (actual time=0.008..1.005 rows=24768 loops=1)
> -> Hash (cost=5.03..5.03 rows=29 width=6)
> (actual time=0.024..0.024 rows=29 loops=1)
> Buckets: 1024 Batches: 1 Memory Usage: 10kB
> -> Seq Scan on outsole_target
> (cost=0.00..5.03 rows=29 width=6) (actual time=0.007..0.018 rows=29 loops=1)
> Filter: (id_ou <= 29)
> Rows Removed by Filter: 213
> -> Nested Loop (cost=707.35..1328.37 rows=1 width=104) (actual
> time=139.036..13395.820 rows=8548 loops=1)
> Join Filter: ((qli.curr_season = qin.curr_season) AND
> ((qli.curr_code)::text = (qin.curr_code)::text))
> Rows Removed by Join Filter: 88552397
> -> Hash Join (cost=707.35..1016.45 rows=1 width=216) (actual
> time=127.374..168.249 rows=8685 loops=1)
> Hash Cond: ((qou.curr_season = qli.curr_season) AND
> ((qou.curr_code)::text = (qli.curr_code)::text))
> -> CTE Scan on qou (cost=0.00..294.22 rows=1189
> width=72) (actual time=18.165..54.968 rows=10275 loops=1)
> Filter: ((ibitmask > 0) AND (cardinality(mat_arr)
> <= 11))
> Rows Removed by Filter: 424
> -> Hash (cost=706.86..706.86 rows=33 width=144) (actual
> time=109.205..109.207 rows=9007 loops=1)
> Buckets: 16384 (originally 1024) Batches: 1
> (originally 1) Memory Usage: 1369kB
> -> Merge Join (cost=689.20..706.86 rows=33
> width=144) (actual time=104.785..107.748 rows=9007 loops=1)
> Merge Cond: ((qup.curr_season =
> qli.curr_season) AND ((qup.curr_code)::text = (qli.curr_code)::text))
> -> Sort (cost=342.09..344.96 rows=1147
> width=72) (actual time=72.320..72.559 rows=9320 loops=1)
> Sort Key: qup.curr_season,
> qup.curr_code COLLATE "C"
> Sort Method: quicksort Memory: 1357kB
> -> CTE Scan on qup (cost=0.00..283.80
> rows=1147 width=72) (actual time=35.401..70.834 rows=9320 loops=1)
> Filter: ((ibitmask > 0) AND
> (cardinality(mat_arr) <= 21))
> Rows Removed by Filter: 1415
> -> Sort (cost=347.12..350.02 rows=1163
> width=72) (actual time=32.461..32.719 rows=10289 loops=1)
> Sort Key: qli.curr_season,
> qli.curr_code COLLATE "C"
> Sort Method: quicksort Memory: 1269kB
> -> CTE Scan on qli (cost=0.00..287.90
> rows=1163 width=72) (actual time=9.543..30.696 rows=10289 loops=1)
> Filter: ((ibitmask > 0) AND
> (cardinality(mat_arr) <= 8))
> Rows Removed by Filter: 180
> -> CTE Scan on qin (cost=0.00..293.65 rows=1186 width=72)
> (actual time=0.001..1.159 rows=10197 loops=8685)
> Filter: ((ibitmask > 0) AND (cardinality(mat_arr) <= 8))
> Rows Removed by Filter: 481
> -> Merge Left Join (cost=2625.49..3399.84 rows=5733 width=104)
> (actual time=4.606..6.733 rows=1415 loops=1)
> Merge Cond: ((qup_1.curr_season = qou_1.curr_season) AND
> ((qup_1.curr_code)::text = (qou_1.curr_code)::text))
> -> Merge Left Join (cost=1958.66..2135.28 rows=5733
> width=136) (actual time=3.479..3.930 rows=1415 loops=1)
> Merge Cond: ((qup_1.curr_season = qin_1.curr_season) AND
> ((qup_1.curr_code)::text = (qin_1.curr_code)::text))
> -> Merge Left Join (cost=1293.25..1388.21 rows=5733
> width=104) (actual time=2.368..2.610 rows=1415 loops=1)
> Merge Cond: ((qup_1.curr_season =
> qli_1.curr_season) AND ((qup_1.curr_code)::text = (qli_1.curr_code)::text))
> -> Sort (cost=641.68..656.02 rows=5733 width=72)
> (actual time=1.296..1.335 rows=1415 loops=1)
> Sort Key: qup_1.curr_season, qup_1.curr_code
> COLLATE "C"
> Sort Method: quicksort Memory: 204kB
> -> CTE Scan on qup qup_1 (cost=0.00..283.80
> rows=5733 width=72) (actual time=0.010..1.119 rows=1415 loops=1)
> Filter: ((ibitmask < 0) OR
> (cardinality(mat_arr) > 21))
> Rows Removed by Filter: 9320
> -> Sort (cost=651.57..666.11 rows=5816 width=72)
> (actual time=1.069..1.075 rows=180 loops=1)
> Sort Key: qli_1.curr_season, qli_1.curr_code
> COLLATE "C"
> Sort Method: quicksort Memory: 41kB
> -> CTE Scan on qli qli_1 (cost=0.00..287.90
> rows=5816 width=72) (actual time=0.057..1.026 rows=180 loops=1)
> Filter: ((ibitmask < 0) OR
> (cardinality(mat_arr) > 8))
> Rows Removed by Filter: 10289
> -> Sort (cost=665.41..680.24 rows=5932 width=72)
> (actual time=1.110..1.124 rows=481 loops=1)
> Sort Key: qin_1.curr_season, qin_1.curr_code
> COLLATE "C"
> Sort Method: quicksort Memory: 68kB
> -> CTE Scan on qin qin_1 (cost=0.00..293.65
> rows=5932 width=72) (actual time=0.016..1.046 rows=481 loops=1)
> Filter: ((ibitmask < 0) OR
> (cardinality(mat_arr) > 8))
> Rows Removed by Filter: 10197
> -> Sort (cost=666.83..681.69 rows=5944 width=72) (actual
> time=1.119..1.128 rows=417 loops=1)
> Sort Key: qou_1.curr_season, qou_1.curr_code COLLATE "C"
> Sort Method: quicksort Memory: 68kB
> -> CTE Scan on qou qou_1 (cost=0.00..294.22 rows=5944
> width=72) (actual time=0.029..1.056 rows=424 loops=1)
> Filter: ((ibitmask < 0) OR (cardinality(mat_arr) > 11))
> Rows Removed by Filter: 10275
> Planning Time: 1.746 ms
> Execution Time: 13405.503 ms
> (116 Zeilen)
>
> This case really brought me to detect the problem!
>
> The original query and data are not shown here, but the principle should
> be clear from the execution plans.
>
> I think the planner shouldn't change the row estimations on further
> steps after left joins at all, and be a bit more conservative on inner
> joins.
> This may be related to the fact that this case has 2 join-conditions
> (xx_season an xx_code).
>
> Thanks for looking
>
> Hans Buschmann
>
>
>
>
>
>
> !! External Email: This email originated from outside of the
> organization. Do not click links or open attachments unless you
> recognize the sender.
>

--
Tomas Vondra
EnterpriseDB: https://nam04.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.enterprisedb.com%2F&data=05%7C01%7Cgjian%40vmware.com%7C9d40e84af2c946f3517a08db9cc61ee2%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C0%7C0%7C638276146959658928%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=VzTmxC6ay28C8%2BaA3Dsi%2BDDWxGEgh9UVaPfc%2BMiL5Mo%3D&reserved=0<http://www.enterprisedb.com/>
The Enterprise PostgreSQL Company

!! External Email: This email originated from outside of the organization. Do not click links or open attachments unless you recognize the sender.

Attachment Content-Type Size
0001-Examine-simple-variable-for-Var-in-CTE.patch text/x-patch 2.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Gustafsson 2023-08-21 08:21:14 Re: pg_upgrade - typo in verbose log
Previous Message Peter Smith 2023-08-21 07:58:25 Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication