From: | Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [HACKERS] <> join selectivity estimate question |
Date: | 2017-12-01 00:16:41 |
Message-ID: | CAEepm=1J7F+VpSViTLs_Te+Fuzrg3qJK6osEABd=2yMMHF-JMA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, Dec 1, 2017 at 10:41 AM, Thomas Munro
<thomas(dot)munro(at)enterprisedb(dot)com> wrote:
> On Fri, Dec 1, 2017 at 4:05 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> Hmm, do you have an example of the better but still-funky estimates
>> handy? Like an EXPLAIN plan?
>
> Sure. Here's some EXPLAIN ANALYZE output from scale 3 TPCH + a few
> indexes[1]. There's a version from HEAD with and without commit
> 7ca25b7d.
So, in that plan we saw anti-join estimate 1 row but really there were
13462. If you remove most of Q21 and keep just the anti-join between
l1 and l3, then you try removing different quals, you can see the the
problem is not the <> qual:
select count(*)
from lineitem l1
where not exists (
select *
from lineitem l3
where l3.l_orderkey = l1.l_orderkey
and l3.l_suppkey <> l1.l_suppkey
and l3.l_receiptdate > l3.l_commitdate
)
=> estimate=1 actual=8998304
select count(*)
from lineitem l1
where not exists (
select *
from lineitem l3
where l3.l_suppkey <> l1.l_suppkey
)
=> estimate=1 actual=0
select count(*)
from lineitem l1
where not exists (
select *
from lineitem l3
where l3.l_orderkey = l1.l_orderkey
)
=> estimate=1 actual=0
select count(*)
from lineitem l1
where not exists (
select *
from lineitem l3
where l3.l_orderkey = l1.l_orderkey
and l3.l_receiptdate > l3.l_commitdate
)
=> estimate=1 actual=294884
The = and <> quals see to be estimated well, but when that filter on
l3 is added we go off the rails. It removes about 37% of the rows in
l3, and means that we sometimes don't find a match, so the anti-join
produces some rows.
--
Thomas Munro
http://www.enterprisedb.com
From | Date | Subject | |
---|---|---|---|
Next Message | David Rowley | 2017-12-01 00:33:04 | Re: Combine function returning NULL unhandled? |
Previous Message | Michael Paquier | 2017-11-30 22:59:26 | Re: Commit fest 2017-11 |