Re: 200 = 199 + 1?

From: Marko Tiikkaja <marko(at)joh(dot)to>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: 200 = 199 + 1?
Date: 2017-09-27 20:17:18
Message-ID: CAL9smLD9x1tccjtrLXvytH-LXzot-pctPKXgsH75vyyj4RLDoQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Sep 27, 2017 at 5:45 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Marko Tiikkaja <marko(at)joh(dot)to> writes:
> > I wonder if the nested loop shouldn't have some kind of a cap on its own
> > estimate if it's wildly off of what you'd get by multiplying the child
> > nodes' estimates with each other?
>
> Nonstarter I'm afraid. The join relation's size estimate is determined
> long before we get to a point where we could multiply the sizes of these
> particular child paths to arrive at the conclusion that it should be
> something different than what we decided originally.

Ah hah. Thanks for the explanation, that makes sense.

> Adjusting the size
> of the nestloop result at that point would merely give it an unfair
> advantage over other paths for the same join relation. (I think it would
> also break some assumptions about paths for the same relation all giving
> the same number of rows, unless parameterized.)
>

With the previous paragraph in mind, I would agree; it's not a very good
idea.

> Looking at it another way, the main thing that the combination of hashagg
> outer path + indexscan inner path knows that eqjoinsel_semi didn't account
> for is that there's a unique index on foo.id. But that info is available
> to eqjoinsel_semi, in the sense that it's been given a nondefault estimate
> that nd1 is equal to the outer relation size. So the mistake that it's
> making is to throw up its hands and use an 0.5 selectivity estimate just
> because it has no info about the inner relation. I think if we'd pushed
> through the nd2/nd1 calculation after setting nd2 = size of inner rel,
> we'd end up with an estimate matching the product of these path sizes.
> (Caution: inadequate caffeine absorbed yet, this might be all wrong.)
>

This sounds very reasonable to me.

.m

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2017-09-27 20:41:52 Re: proposal - Default namespaces for XPath expressions (PostgreSQL 11)
Previous Message Tom Lane 2017-09-27 19:50:05 Re: Binary search in fmgr_isbuiltin() is a bottleneck.