Re: Bogus nestloop rows estimate in 8.4.7

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Marti Raudsepp <marti(at)juffo(dot)org>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Bogus nestloop rows estimate in 8.4.7
Date: 2012-05-29 16:22:18
Message-ID: CA+TgmoagquAwVpk=mYUaLnQ27guYpz4naUp04_OqKuOz=PFFJw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, May 28, 2012 at 6:37 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Marti Raudsepp <marti(at)juffo(dot)org> writes:
>> On Mon, May 28, 2012 at 11:23 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> However, the error in your original example is far too large to be
>>> explained by that, so I think it was tripping over something different.
>
>> Good point. But I generated a bigger data set with the above test case
>> and it gets progressively worse with more rows and partitions. (The
>> original database has 2x4 billion rows in over 100 partitions)
>
>> Here's a bigger test case, 2GB total (will take a few minutes to
>> generate). It gives a total estimate of 3900158, even though the
>> Append nodes suggest 13x2406 rows.
>
> On reflection I think this is an artifact of the lack of
> inheritance-tree stats in 8.4.  The estimated size of the join does
> *not* come from the product of the two appendrel sizes shown in EXPLAIN,
> because the inner one is a inner indexscan using a parameter from the
> outer side (what we would now call a parameterized path).  Rather, the
> estimated size is join selectivity times outer relation size times inner
> relation size.  The outer relation size, after applying its restriction
> clause, is indeed only 13 rows, but the inner relation size is 60e6 rows
> because it has no restriction clause.  If we had an accurate join
> selectivity estimate that'd be fine, but for lack of any stats about the
> inheritance tree eqjoinsel just punts and returns DEFAULT_EQ_SEL, ie
> 0.005.  And that works out to your result.

Hmm, but isn't this a case of the left hand not knowing what the right
hand is doing? I mean, somehow we have enough information to estimate
that the index scans on b{1,2,3} are going to produce 2 rows per
execution, but having figured that out (correctly) we then proceed to
ignore it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2012-05-29 16:35:57 Re: Bogus nestloop rows estimate in 8.4.7
Previous Message Robert Haas 2012-05-29 16:10:41 Re: Uh, I change my mind about commit_delay + commit_siblings (sort of)