Re: <> join selectivity estimate question

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: <> join selectivity estimate question
Date: 2017-03-17 19:11:30
Message-ID: 25906.1489777890@sss.pgh.pa.us
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Fri, Mar 17, 2017 at 1:14 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> It would not be too hard to convince me that neqjoinsel should
>> simply return 1.0 for any semijoin/antijoin case, perhaps with
>> some kind of discount for nullfrac. Whether or not there's an
>> equal row, there's almost always going to be non-equal row(s).
>> Maybe we can think of a better implementation but that seems
>> like the zero-order approximation.

> Yeah, it's not obvious how to do better than that considering only one
> clause at a time. Of course, what we really want to know is
> P(x<>y|z=t), but don't ask me how to compute that.

Yeah. Another hole in this solution is that it means that the
estimate for x <> y will be quite different from the estimate
for NOT(x = y). You wouldn't notice it in the field unless
somebody forgot to put a negator link on their equality operator,
but it seems like ideally we'd think of a solution that made sense
for generic NOT in this context.

No, I have no idea how to do that.

regards, tom lane
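
To make the discrepancy discussed above concrete, here is a minimal sketch in Python of the two estimates side by side. It is not PostgreSQL code: the names JoinType, neq_join_selectivity and generic_not_selectivity are hypothetical, and the inner-join branch (1 - equality selectivity) is a simplifying assumption. The semijoin/antijoin branch follows the zero-order approximation proposed in the thread: return 1.0, discounted by the null fraction.

```python
from enum import Enum


class JoinType(Enum):
    # Hypothetical stand-in for the planner's join types.
    INNER = "inner"
    SEMI = "semi"
    ANTI = "anti"


def clamp_probability(p: float) -> float:
    """Keep a selectivity estimate inside [0, 1]."""
    return max(0.0, min(1.0, p))


def neq_join_selectivity(jointype: JoinType, eq_selectivity: float,
                         nullfrac: float) -> float:
    """Sketch of the proposed <> join estimate (assumed, not the real neqjoinsel)."""
    if jointype in (JoinType.SEMI, JoinType.ANTI):
        # Zero-order approximation: whether or not an equal row exists,
        # a non-null outer row almost always finds some non-equal row.
        return clamp_probability(1.0 - nullfrac)
    # Assumption for other join types: simply negate the equality estimate.
    return clamp_probability(1.0 - eq_selectivity)


def generic_not_selectivity(eq_selectivity: float) -> float:
    """Generic NOT(x = y) estimate: one minus the selectivity of x = y."""
    return clamp_probability(1.0 - eq_selectivity)


# Semijoin where most outer rows have an equal partner (illustrative numbers).
semi_neq = neq_join_selectivity(JoinType.SEMI, eq_selectivity=0.9, nullfrac=0.0)
semi_not = generic_not_selectivity(0.9)
```

With these made-up inputs the <> estimate stays at the top of the range while the generic NOT estimate is small, which is the gap that would appear only if an equality operator lacked a negator link.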
