Re: BUG #15383: Join Filter cost estimation problem in 10.5

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Marko Tiikkaja <marko(at)joh(dot)to>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #15383: Join Filter cost estimation problem in 10.5
Date: 2018-10-11 05:36:35
Message-ID: CAKJS1f8Esad1fxYBjoKNR58fMrUba2HzdbYFFaBDrzp5U=ztNA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

On 20 September 2018 at 08:18, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> So after poking around for awhile, my conclusion is that the cost
> estimation aspects of the inner_unique patch are completely broken,
> and it's not going to be very easy to fix.
>
> The core issue here is that compute_semi_anti_join_factors was never
> designed to work for any joins that aren't really SEMI or ANTI joins,
> and just taking out the assert about that doesn't fix it. In
> particular, passing jointype == JOIN_SEMI to clauselist_selectivity,
> when the underlying sjinfo has jointype == JOIN_INNER, isn't a supported
> combination. If you look at eqjoinsel() you will notice that it pays
> no attention at all to the jointype parameter, only to sjinfo->jointype.
> Therefore what we get out of the first clauselist_selectivity is the
> same value as we get from the second one (ie, inner-join selectivity)
> leading to entirely insane results from compute_semi_anti_join_factors.

I looked at this again and I see that when we're building the join rel
we call calc_joinrel_size_estimate() to set ->rows. We do this based
on the actual join type as at this stage we've yet to detect if the
join has a unique inner side or not. My original idea with this code
was that unique joins would be costed in the same way as semi joins
which I had hoped would encourage the unique side to be put on the
inner side when the costs were roughly the same. Now in light of this
bug, and given that the rows for the join rel is set based on the
original join type, I think it's better just to pull out the code that
attempts to cost unique joins differently. I don't really see
changing the rows property of the joinrel as something we can actually
do since any updated estimate would be bogus if the join order was
reversed, which is possible for JOIN_INNER with a unique inner, but
not possible for a JOIN_SEMI.

I've attached a patch which is a partial revert of the costing code
that was changed in 9c7f5229ad6.

I'm a bit unsure how good an idea it is to put back the Assert in
compute_semi_anti_join_factors() in the back branches.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
revert_unique_join_costing.patch application/octet-stream 8.5 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Abhishek Tripathi 2018-10-11 09:48:46 Want to acquire lock on tables where primary of one table is foreign key on othere
Previous Message neela 2018-10-11 03:58:49 Re: General: Unable to start the postgresql after copying postgresql.log

Browse pgsql-hackers by date

  From Date Subject
Next Message Dilip Kumar 2018-10-11 06:00:24 Undo worker and transaction rollback
Previous Message Amit Khandekar 2018-10-11 05:03:22 Re: TupleTableSlot abstraction