Re: Bogus nestloop rows estimate in 8.4.7

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Marti Raudsepp <marti(at)juffo(dot)org>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Bogus nestloop rows estimate in 8.4.7
Date: 2012-05-28 20:23:06
Message-ID: 12544.1338236586@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Marti Raudsepp <marti(at)juffo(dot)org> writes:
> On Mon, May 28, 2012 at 10:32 PM, Marti Raudsepp <marti(at)juffo(dot)org> wrote:
>> There was a similar case in 9.0.4 with WHERE i=1, but that has been
>> fixed in 9.0.7

> Oh, it's been fixed in 9.0.7, but apparently not in 8.4.11; the empty
> parent tables are confusing the estimate:

Hmm ... what your test case seems to be exhibiting is related to this:

Author: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Branch: master [f3ff0433a] 2011-07-14 17:30:57 -0400
Branch: REL9_1_STABLE Release: REL9_1_0 [cf8245285] 2011-07-14 17:31:12 -0400
Branch: REL9_0_STABLE Release: REL9_0_5 [0dd46a776] 2011-07-14 17:31:25 -0400

In planner, don't assume that empty parent tables aren't really empty.

There's a heuristic in estimate_rel_size() to clamp the minimum size
estimate for a table to 10 pages, unless we can see that vacuum or analyze
has been run (and set relpages to something nonzero, so this will always
happen for a table that's actually empty). However, it would be better
not to do this for inheritance parent tables, which very commonly are
really empty and can be expected to stay that way. Per discussion of a
recent pgsql-performance report from Anish Kejariwal. Also prevent it
from happening for indexes (although this is more in the nature of
documentation, since CREATE INDEX normally initializes relpages to
something nonzero anyway).

Back-patch to 9.0, because the ability to collect statistics across a
whole inheritance tree has improved the planner's estimates to the point
where this relatively small error makes a significant difference. In the
referenced report, merge or hash joins were incorrectly estimated as
cheaper than a nestloop with inner indexscan on the inherited table.
That was less likely before 9.0 because the lack of inherited stats would
have resulted in a default (and rather pessimistic) estimate of the cost
of a merge or hash join.

However, the error in your original example is far too large to be
explained by that, so I think it was tripping over something different.
When I run your test case in 8.4, I get

Nested Loop (cost=0.00..1276.12 rows=28168 width=4)
Join Filter: (public.a_parent.i = public.b_parent.i)
-> Append (cost=0.00..62.57 rows=14 width=4)
-> Seq Scan on a_parent (cost=0.00..46.00 rows=12 width=4)
Filter: ((i >= 1) AND (i <= 2))
-> Index Scan using a1_i_idx on a_child1 a_parent (cost=0.00..8.29 rows=1 width=4)
...
-> Append (cost=0.00..56.63 rows=2404 width=4)
-> Seq Scan on b_parent (cost=0.00..34.00 rows=2400 width=4)
-> Index Scan using b1_i_idx on b_child1 b_parent (cost=0.00..11.34 rows=2 width=4)
...

and that join size estimate is not too out of line if you accept the
admittedly-bogus numbers for the appendrel sizes. There seems to be
something else going on in your original example.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Magnus Hagander 2012-05-28 20:38:24 Re: pg_basebackup --xlog compatibility break
Previous Message Peter Eisentraut 2012-05-28 20:11:03 pg_basebackup --xlog compatibility break