Re: BUG #15383: Join Filter cost estimation problem in 10.5

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Marko Tiikkaja <marko(at)joh(dot)to>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #15383: Join Filter cost estimation problem in 10.5
Date: 2019-03-08 23:04:16
Message-ID: 22099.1552086256@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-bugs

David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> writes:
> I've attached a patch which is a partial revert of the costing code
> that was changed in 9c7f5229ad6.

I've spent some time thinking about this (sorry for the delay).
I don't think I like doing what you suggest here. While it makes
Marko's example better, it will make other cases worse, because
it effectively causes the planner not to give any cost preference
to inner_unique plans. That means it'll sometimes choose another
plan over a plan making use of inner_unique when we don't want it
to. So that's not very nice, and it definitely makes me not want
to back-patch this: people aren't that happy with back-patching
plan-destabilizing changes even when we think they are 100% wins.

So what I'm wondering about is if we can't fix
compute_semi_anti_join_factors() to produce something sane without
trying to apply JOIN_SEMI estimation mode to non-semijoins. It
only has to compute two things, and match_count seems pretty
trivial: it's 1 by definition for an inner_unique case. All we
need is a rule for computing outer_match_frac.

I experimented with the attached, which estimates outer_match_frac
as joinrel size divided by outer rel size, relying on the knowledge
that no outer row matches more than one inner row. This makes
Marko's case better, but it's not completely better: it underestimates
the cost of the expensive function by 3X compared to what pre-v10 did.
That is due to the point I made upthread:

>> Also worth noting here is that it's wrong in the first place to be
>> including the selectivity of the inequality qual in our calculation
>> of how many rows are going to be fed to the inequality qual :-(.
>> Again, the infrastructure involved isn't really broken for its
>> designed purpose, because with a normal semijoin there aren't any
>> filter quals to be considered separately; but that assumption breaks
>> down when you try to use that infrastructure for a regular join that
>> happens to have a unique inner side.

We're including the estimated selectivity of the expensive_func qual in
our estimate of how many rows will be fed to the expensive function :-(.

It's not easy to make this better, though: the places that use
outer_match_frac aren't really drawing a distinction between filter
quals and quals that relate to the uniqueness property, and that's
what we'd need in order to have a sane estimate here. Nor does
compute_semi_anti_join_factors have any idea which quals are which.

I think we'd have to redesign those cost calculations completely to
get ideal answers, and I don't especially want to do that right now.
So I'm wondering about the attached as a stopgap. It seems like it's
an improvement over what we have, even if not a full fix.

(BTW, another point is that the executor is also imperfect here:
it will keep scanning if the "expensive_func(gs1.i + gs2.i) > 0"
qual fails, though it could stop, because the inner_unique property
is not dependent on that qual.)

Notes:

* For ease of review, I didn't reindent the body of
compute_semi_anti_join_factors, though it'd need it.

* The one join.sql test case that changes plan seems to need a redesign;
the new plan is correct, but it seems like it's not testing what its
comment says it's for.

Thoughts?

regards, tom lane

Attachment Content-Type Size
unique-join-costing-stopgap-1.patch text/x-diff 3.2 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Michael Paquier 2019-03-09 01:35:57 Re: BUG #15631: Generated as identity field in a temporary table with on commit drop corrupts system catalogs
Previous Message PG Bug reporting form 2019-03-08 21:01:34 BUG #15680: New Versions of pgadmin4 and psqlodbc come with OLD Dlls