BUG #15383: Join Filter cost estimation problem in 10.5

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: marko(at)joh(dot)to
Subject: BUG #15383: Join Filter cost estimation problem in 10.5
Date: 2018-09-13 10:45:21
Message-ID: 153683552113.22350.18441286362867559841@wrigleys.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

The following bug has been logged on the website:

Bug reference: 15383
Logged by: Marko Tiikkaja
Email address: marko(at)joh(dot)to
PostgreSQL version: 10.5
Operating system: Linux
Description:

I was looking at a problematic plan submitted by "sjamaan" on IRC and I
noticed that the join filter estimation seems completely wrong here:

create function expensive_func(int) returns int as $$ begin return 1; end $$
language plpgsql stable cost 10000;
create table unique_inner(a int primary key);
insert into unique_inner select generate_series(1, 10000);

explain select * from unique_inner gs1(i) join generate_series(1, 10) gs2(i)
using (i) where expensive_func(gs1.i + gs2.i) > 0;
QUERY PLAN
----------------------------------------------------------------------------------
Hash Join (cost=303.19..315.81 rows=333 width=4)
Hash Cond: (gs2.i = gs1.i)
Join Filter: (expensive_func((gs1.i + gs2.i)) > 0)
-> Function Scan on generate_series gs2 (cost=0.00..10.00 rows=1000
width=4)
-> Hash (cost=159.75..159.75 rows=11475 width=4)
-> Seq Scan on unique_inner gs1 (cost=0.00..159.75 rows=11475
width=4)
(6 rows)

(Notice how even though the function is expected to be called at least 333
times, the cost doesn't account for even a single call.)

Dropping the primary key constraint makes the costs more reasonable (though
I'm still not sure how the planner arrives at these costs):

alter table unique_inner drop constraint unique_inner_pkey;
explain select * from unique_inner gs1(i) join generate_series(1, 10) gs2(i)
using (i) where expensive_func(gs1.i + gs2.i) > 0;
QUERY PLAN
----------------------------------------------------------------------------------------
Hash Join (cost=22.50..1436880.94 rows=19125 width=4)
Hash Cond: (gs1.i = gs2.i)
Join Filter: (expensive_func((gs1.i + gs2.i)) > 0)
-> Seq Scan on unique_inner gs1 (cost=0.00..159.75 rows=11475
width=4)
-> Hash (cost=10.00..10.00 rows=1000 width=4)
-> Function Scan on generate_series gs2 (cost=0.00..10.00
rows=1000 width=4)
(6 rows)

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Dilip Kumar 2018-09-13 12:00:08 Re: BUG #15324: Non-deterministic behaviour from parallelised sub-query
Previous Message Mareks Kalnačs 2018-09-13 10:19:04 RE: PostgreSQL 10.0 SELECT LIMIT performance problem

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2018-09-13 11:08:30 Re: Problem while setting the fpw with SIGHUP
Previous Message Kyotaro HORIGUCHI 2018-09-13 09:29:31 Re: [HACKERS] Restricting maximum keep segments by repslots