Re: <> join selectivity estimate question

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: <> join selectivity estimate question
Date: 2017-03-17 22:59:51
Message-ID: CAEepm=0AHTPE2FtoV1=gYZH-=YxuWY0ufk-S=3dVrf+enmqA7A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Mar 18, 2017 at 11:49 AM, Thomas Munro
<thomas(dot)munro(at)enterprisedb(dot)com> wrote:
> On Sat, Mar 18, 2017 at 6:14 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> After a bit more thought, it seems like the bug here is that "the
>> fraction of the LHS that has a non-matching row" is not one minus
>> "the fraction of the LHS that has a matching row". In fact, in
>> this example, *all* LHS rows have both matching and non-matching
>> RHS rows. So the problem is that neqjoinsel is doing something
>> that's entirely insane for semijoin cases.
>>
>> It would not be too hard to convince me that neqjoinsel should
>> simply return 1.0 for any semijoin/antijoin case, perhaps with
>> some kind of discount for nullfrac. Whether or not there's an
>> equal row, there's almost always going to be non-equal row(s).
>> Maybe we can think of a better implementation but that seems
>> like the zero-order approximation.
>
> Right. If I temporarily hack neqjoinsel() thus:
>
> result = 1.0 - result;
> +
> + if (jointype == JOIN_SEMI)
> + result = 1.0;
> +
> PG_RETURN_FLOAT8(result);
> }
>
> ... then I obtain sensible row estimates and the following speedups
> for TPCH Q21:
>
> 8 workers = 8.3s -> 7.8s
> 7 workers = 8.2s -> 7.9s
> 6 workers = 8.5s -> 8.2s
> 5 workers = 8.9s -> 8.5s
> 4 workers = 9.5s -> 9.1s
> 3 workers = 39.7s -> 9.9s
> 2 workers = 36.9s -> 11.7s
> 1 worker = 38.2s -> 15.0s
> 0 workers = 47.9s -> 24.7s
>
> The plan is similar to the good plan from before even at lower worker
> counts, but slightly better because the aggregation has been pushed
> under the Gather node. See attached.

... and so has the anti-join, probably more importantly.

Thanks for looking at this!

--
Thomas Munro
http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2017-03-18 00:02:19 Re: increasing the default WAL segment size
Previous Message Thomas Munro 2017-03-17 22:49:13 Re: <> join selectivity estimate question