Re: <> join selectivity estimate question

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: <> join selectivity estimate question
Date:	2017-03-17 18:24:14
Message-ID:	CA+TgmobL9r_eAN8wXA9ruSgvYZXvHX2You30ZxpXEd6A9da_dQ@mail.gmail.com
Lists:	pgsql-hackers

On Fri, Mar 17, 2017 at 1:14 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> After a bit more thought, it seems like the bug here is that "the
> fraction of the LHS that has a non-matching row" is not one minus
> "the fraction of the LHS that has a matching row". In fact, in
> this example, *all* LHS rows have both matching and non-matching
> RHS rows. So the problem is that neqjoinsel is doing something
> that's entirely insane for semijoin cases.

Thanks for the analysis. I had a niggling feeling that there might be
something of this sort going on, but I was not sure.
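
To make the quoted point concrete, here is a minimal sketch in plain Python, not planner code. The helper name semijoin_fractions and the toy data are assumptions made up for illustration. It counts, for each LHS row, whether some RHS row is equal and whether some RHS row is non-equal, ignoring NULLs entirely.

```python
def semijoin_fractions(lhs, rhs):
    """Return (fraction of LHS rows with an equal RHS row,
               fraction of LHS rows with a non-equal RHS row).

    Hypothetical helper for illustration; assumes no NULLs and a non-empty LHS.
    """
    n = len(lhs)
    with_match = sum(1 for x in lhs if any(x == y for y in rhs))
    with_nonmatch = sum(1 for x in lhs if any(x != y for y in rhs))
    return with_match / n, with_nonmatch / n


# Toy data: every LHS value also appears on the RHS, next to other values.
lhs = [1, 2, 3]
rhs = [1, 2, 3]

frac_eq, frac_ne = semijoin_fractions(lhs, rhs)
print(frac_eq, frac_ne, 1.0 - frac_eq)
```

With this data both fractions are 1.0, so taking one minus the "=" semijoin fraction would give 0.0 for "<>", even though every LHS row qualifies.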

> It would not be too hard to convince me that neqjoinsel should
> simply return 1.0 for any semijoin/antijoin case, perhaps with
> some kind of discount for nullfrac. Whether or not there's an
> equal row, there's almost always going to be non-equal row(s).
> Maybe we can think of a better implementation but that seems
> like the zero-order approximation.

Yeah, it's not obvious how to do better than that considering only one
clause at a time. Of course, what we really want to know is
P(x<>y|z=t), but don't ask me how to compute that.
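
For illustration only, the zero-order approximation quoted above could be sketched as follows. The thread does not say what form the nullfrac discount should take; subtracting the LHS null fraction is an assumption here, and both function names are hypothetical. The second function brute-forces the true fraction on toy data using SQL-like NULL handling, where a comparison involving NULL never qualifies.

```python
def neq_semijoin_selectivity(lhs_nullfrac):
    """Zero-order estimate for a "<>" semijoin clause.

    Assumption: the nullfrac discount is simply 1.0 minus the LHS null
    fraction. The thread leaves the exact form of the discount open.
    """
    if not 0.0 <= lhs_nullfrac <= 1.0:
        raise ValueError("nullfrac must be in [0, 1]")
    return 1.0 - lhs_nullfrac


def empirical_neq_semijoin_fraction(lhs, rhs):
    """Fraction of LHS rows having at least one non-NULL, non-equal RHS row."""
    def has_nonequal(x):
        # A NULL on either side means the "<>" comparison does not qualify.
        return x is not None and any(y is not None and x != y for y in rhs)

    return sum(1 for x in lhs if has_nonequal(x)) / len(lhs)


# Toy data: one of four LHS rows is NULL (represented as None).
lhs = [1, 2, None, 3]
rhs = [1, 2, 3]

estimate = neq_semijoin_selectivity(lhs.count(None) / len(lhs))
observed = empirical_neq_semijoin_fraction(lhs, rhs)
print(estimate, observed)
```

On this data the estimate and the brute-force count agree. They would diverge when the RHS holds only a single distinct value, which is the "almost always" caveat in the quoted text.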

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Re: <> join selectivity estimate question at 2017-03-17 17:14:12 from Tom Lane

Responses

Re: <> join selectivity estimate question at 2017-03-17 19:11:30 from Tom Lane
