Re: BUG #15618: Duplicating a join clause breaks index use

From: Alex <cdalxndr(at)yahoo(dot)com>
To: Pantelis Theodosiou <ypercube(at)gmail(dot)com>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Subject: Re: BUG #15618: Duplicating a join clause breaks index use
Date: 2019-02-02 14:04:39
Message-ID: 197491491.1997073.1549116279483@mail.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Also, note that if I remove any of the following joins and their columns from select, while keeping the duplicate join, the issue doesn't reproduce anymore (query is fast):left outer join product_property_value ppv on pp.value_id=ppv.idleft outer join description_value depv on dep.description_value_id=depv.idleft outer join description_value dev on de.description_value_id=dev.id
On Saturday, February 2, 2019, 3:07:57 PM GMT+2, Alex <cdalxndr(at)yahoo(dot)com> wrote:

Yes.

On Saturday, February 2, 2019, 2:31:23 PM GMT+2, Pantelis Theodosiou <ypercube(at)gmail(dot)com> wrote:

Is there a foreign key from product (shop_category) that references shop_category (id) ?

On Sat, Feb 2, 2019 at 12:17 PM Alex <cdalxndr(at)yahoo(dot)com> wrote:

Query:
select
 pp.id as id1_13_0_,
 ppv.id as id2_14_1_,
 de.id as id1_3_2_,
 dev.id as id1_5_3_,
 dep.id as id1_3_4_,
 depv.id as id1_5_5_,
 pp.description_source as descript5_13_0_,
 pp.meaning as meaning2_13_0_,
 pp.secondary_meaning as secondar3_13_0_,
 pp.source as source4_13_0_,
 pp.value_id as value_id6_13_0_,
 ppv.first as first3_14_1_,
 ppv.third as third4_14_1_,
 ppv.second as second5_14_1_,
 ppv.text as text6_14_1_,
 ppv.type as type1_14_1_,
 de.parent as parent3_3_2_,
 de.pos as pos2_3_2_,
 de.description_value_id as descript4_3_2_,
 de.product as product5_3_2_,
 dev.language as language2_5_3_,
 dev.lexeme as lexeme3_5_3_,
 dev.value as value4_5_3_,
 dep.parent as parent3_3_4_,
 dep.pos as pos2_3_4_,
 dep.description_value_id as descript4_3_4_,
 dep.product as product5_3_4_,
 depv.language as language2_5_5_,
 depv.lexeme as lexeme3_5_5_,
 depv.value as value4_5_5_
from
    product p
    left outer join product_property pp on p.id=pp.product_id
    left outer join product_property_value ppv on pp.value_id=ppv.id
    left outer join description_element de on pp.description_source=de.id
    left outer join description_value dev on de.description_value_id=dev.id
    left outer join description_element dep on de.parent=dep.id
    left outer join description_value depv on dep.description_value_id=depv.id
    left outer join shop_category sc_dup on p.shop_category=sc_dup.id           --duplicate join
    left outer join shop_category sc on p.shop_category=sc.id
    left outer join category c on sc.category=c.id
where
    c.id=8511

Explain analyze with duplicate join:

"Gather  (cost=134471.60..285324.15 rows=48 width=320) (actual time=1746.142..3854.899 rows=90 loops=1)"
"  Workers Planned: 2"
"  Workers Launched: 2"
"  ->  Hash Join  (cost=133471.60..284319.35 rows=20 width=320) (actual time=1753.615..3702.835 rows=30 loops=3)"
"        Hash Cond: (p.shop_category = sc.id)"
"        ->  Hash Left Join  (cost=133454.98..283698.84 rows=160986 width=324) (actual time=1661.551..3691.094 rows=131278 loops=3)"
"              Hash Cond: (dep.description_value_id = depv.id)"
"              ->  Nested Loop Left Join  (cost=119237.57..253506.99 rows=160986 width=218) (actual time=1465.635..3282.286 rows=131278 loops=3)"
"                    ->  Hash Left Join  (cost=119237.14..175609.97 rows=160986 width=198) (actual time=1465.625..2561.546 rows=131278 loops=3)"
"                          Hash Cond: (de.description_value_id = dev.id)"
"                          ->  Hash Left Join  (cost=105019.73..150448.12 rows=160986 width=92) (actual time=1260.801..2160.415 rows=131278 loops=3)"
"                                Hash Cond: (pp.value_id = ppv.id)"
"                                ->  Hash Left Join  (cost=90691.88..127921.82 rows=160986 width=52) (actual time=1096.581..1811.725 rows=131278 loops=3)"
"                                      Hash Cond: (pp.description_source = de.id)"
"                                      ->  Hash Left Join  (cost=14036.24..34090.73 rows=160986 width=32) (actual time=187.524..370.095 rows=131278 loops=3)"
"                                            Hash Cond: (p.id = pp.product_id)"
"                                            ->  Parallel Seq Scan on product p  (cost=0.00..11757.46 rows=37946 width=8) (actual time=0.118..55.751 rows=30357 loops=3)"
"                                            ->  Hash  (cost=6564.66..6564.66 rows=386366 width=32) (actual time=185.622..185.622 rows=386366 loops=3)"
"                                                  Buckets: 65536  Batches: 8  Memory Usage: 3390kB"
"                                                  ->  Seq Scan on product_property pp  (cost=0.00..6564.66 rows=386366 width=32) (actual time=0.103..80.719 rows=386366 loops=3)"
"                                      ->  Hash  (cost=36108.73..36108.73 rows=2208473 width=20) (actual time=905.265..905.265 rows=2208473 loops=3)"
"                                            Buckets: 65536  Batches: 64  Memory Usage: 2254kB"
"                                            ->  Seq Scan on description_element de  (cost=0.00..36108.73 rows=2208473 width=20) (actual time=0.101..359.710 rows=2208473 loops=3)"
"                                ->  Hash  (cost=6477.38..6477.38 rows=386438 width=40) (actual time=160.004..160.004 rows=386438 loops=3)"
"                                      Buckets: 65536  Batches: 8  Memory Usage: 3050kB"
"                                      ->  Seq Scan on product_property_value ppv  (cost=0.00..6477.38 rows=386438 width=40) (actual time=0.108..62.499 rows=386438 loops=3)"
"                          ->  Hash  (cost=6865.96..6865.96 rows=252596 width=106) (actual time=204.657..204.657 rows=252596 loops=3)"
"                                Buckets: 32768  Batches: 16  Memory Usage: 2449kB"
"                                ->  Seq Scan on description_value dev  (cost=0.00..6865.96 rows=252596 width=106) (actual time=0.102..80.803 rows=252596 loops=3)"
"                    ->  Index Scan using description_element_pkey on description_element dep  (cost=0.43..0.48 rows=1 width=20) (actual time=0.005..0.005 rows=1 loops=393835)"
"                          Index Cond: (de.parent = id)"
"              ->  Hash  (cost=6865.96..6865.96 rows=252596 width=106) (actual time=195.776..195.776 rows=252596 loops=3)"
"                    Buckets: 32768  Batches: 16  Memory Usage: 2449kB"
"                    ->  Seq Scan on description_value depv  (cost=0.00..6865.96 rows=252596 width=106) (actual time=0.019..74.589 rows=252596 loops=3)"
"        ->  Hash  (cost=16.61..16.61 rows=1 width=4) (actual time=0.232..0.232 rows=1 loops=3)"
"              Buckets: 1024  Batches: 1  Memory Usage: 9kB"
"              ->  Nested Loop  (cost=0.56..16.61 rows=1 width=4) (actual time=0.224..0.226 rows=1 loops=3)"
"                    ->  Index Scan using shop_category_category_idx on shop_category sc  (cost=0.28..8.30 rows=1 width=8) (actual time=0.095..0.095 rows=1 loops=3)"
"                          Index Cond: (category = 8511)"
"                    ->  Index Only Scan using category_pkey on category c  (cost=0.28..8.29 rows=1 width=4) (actual time=0.125..0.126 rows=1 loops=3)"
"                          Index Cond: (id = 8511)"
"                          Heap Fetches: 1"
"Planning time: 2.543 ms"
"Execution time: 3857.492 ms"

Explain analyze without duplicate join:
"Nested Loop  (cost=3.40..216.69 rows=48 width=320) (actual time=0.203..1.624 rows=90 loops=1)"
"  ->  Index Only Scan using category_pkey on category c  (cost=0.28..8.29 rows=1 width=4) (actual time=0.049..0.049 rows=1 loops=1)"
"        Index Cond: (id = 8511)"
"        Heap Fetches: 1"
"  ->  Nested Loop Left Join  (cost=3.12..207.91 rows=48 width=324) (actual time=0.152..1.556 rows=90 loops=1)"
"        ->  Nested Loop Left Join  (cost=2.70..186.48 rows=48 width=218) (actual time=0.149..1.416 rows=90 loops=1)"
"              ->  Nested Loop Left Join  (cost=2.27..163.25 rows=48 width=198) (actual time=0.141..1.322 rows=90 loops=1)"
"                    ->  Nested Loop Left Join  (cost=1.85..141.81 rows=48 width=92) (actual time=0.138..1.084 rows=90 loops=1)"
"                          ->  Nested Loop Left Join  (cost=1.42..110.35 rows=48 width=72) (actual time=0.133..0.967 rows=90 loops=1)"
"                                ->  Nested Loop Left Join  (cost=1.00..87.41 rows=48 width=32) (actual time=0.101..0.651 rows=90 loops=1)"
"                                      ->  Nested Loop  (cost=0.57..78.82 rows=11 width=8) (actual time=0.069..0.432 rows=27 loops=1)"
"                                            ->  Index Scan using shop_category_category_idx on shop_category sc  (cost=0.28..8.30 rows=1 width=8) (actual time=0.036..0.036 rows=1 loops=1)"
"                                                  Index Cond: (category = 8511)"
"                                            ->  Index Scan using product_shop_category_idx on product p  (cost=0.29..70.34 rows=18 width=8) (actual time=0.031..0.390 rows=27 loops=1)"
"                                                  Index Cond: (shop_category = sc.id)"
"                                      ->  Index Scan using product_property_product_id_idx on product_property pp  (cost=0.42..0.71 rows=7 width=32) (actual time=0.006..0.007 rows=3 loops=27)"
"                                            Index Cond: (p.id = product_id)"
"                                ->  Index Scan using product_property_value_pkey on product_property_value ppv  (cost=0.42..0.48 rows=1 width=40) (actual time=0.003..0.003 rows=1 loops=90)"
"                                      Index Cond: (pp.value_id = id)"
"                          ->  Index Scan using description_element_pkey on description_element de  (cost=0.43..0.66 rows=1 width=20) (actual time=0.001..0.001 rows=0 loops=90)"
"                                Index Cond: (pp.description_source = id)"
"                    ->  Index Scan using description_value_pkey on description_value dev  (cost=0.42..0.45 rows=1 width=106) (actual time=0.002..0.002 rows=0 loops=90)"
"                          Index Cond: (de.description_value_id = id)"
"              ->  Index Scan using description_element_pkey on description_element dep  (cost=0.43..0.48 rows=1 width=20) (actual time=0.001..0.001 rows=0 loops=90)"
"                    Index Cond: (de.parent = id)"
"        ->  Index Scan using description_value_pkey on description_value depv  (cost=0.42..0.45 rows=1 width=106) (actual time=0.001..0.001 rows=0 loops=90)"
"              Index Cond: (dep.description_value_id = id)"
"Planning time: 2.689 ms"
"Execution time: 1.746 ms"
Note that both query results are the same. Using order by all columns in select results in same content.
On Friday, February 1, 2019, 8:25:44 PM GMT+2, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:

pá 1. 2. 2019 v 19:03 odesílatel PG Bug reporting form <noreply(at)postgresql(dot)org> napsal:

The following bug has been logged on the website:

Bug reference:      15618
Logged by:          cd a
Email address:      cdalxndr(at)yahoo(dot)com
PostgreSQL version: 10.3
Operating system:   Windows 10
Description:       

Adding a unused join clause to the query, modifies the run time from 50ms to
4seconds.
No columns from the join clause are returned and the joined table is not
used in other places.
The join clause is a duplicate of another one that is used for further
joining.

The joined clause should have been automatically removed as it has no side
effects, thus using indices and optimizing query runtime.

Please, can you send a example. Generally, you can remove only some outer join clauses. unused inner join cannot be removed, because it can has impact on result.
Regards
Pavel Stehule

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2019-02-02 15:08:05 Re: BUG #15618: Duplicating a join clause breaks index use
Previous Message Alex 2019-02-02 13:07:57 Re: BUG #15618: Duplicating a join clause breaks index use