Re: planner or statistical bug on 8.5

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Matteo Beccati <php(at)beccati(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: planner or statistical bug on 8.5
Date: 2010-01-12 18:13:40
Message-ID: 603c8f071001121013i3f8746cbk17eff500d6f7b07c@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jan 12, 2010 at 3:08 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
> 2010/1/12 Matteo Beccati <php(at)beccati(dot)com>:
>> Il 12/01/2010 08:55, Pavel Stehule ha scritto:
>>>
>>> I checked query and I was surprised with very strange plan:
>>>
>>> postgres=# explain select  a, b from a,b,c;
>>>                               QUERY PLAN
>>> -----------------------------------------------------------------------
>>>  Nested Loop  (cost=0.00..276595350.00 rows=13824000000 width=8)
>>>    ->   Nested Loop  (cost=0.00..115292.00 rows=5760000 width=8)
>>>          ->   Seq Scan on a  (cost=0.00..34.00 rows=2400 width=4)
>>>          ->   Materialize  (cost=0.00..82.00 rows=2400 width=4)
>>>                ->   Seq Scan on b  (cost=0.00..34.00 rows=2400 width=4)
>>>    ->   Materialize  (cost=0.00..82.00 rows=2400 width=0)
>>>          ->   Seq Scan on c  (cost=0.00..34.00 rows=2400 width=0)
>>> (7 rows)
>>
>> It doesn't surprise me. Tables are empty, thus get a default non-0 row
>> estimate, which happens to be 2400:
>>
>
> I though so default estimate is used only when table wasn't analysed.
> But you have a true. I am verifying it on 8.3 and the behave is same.

Not quite totally the same. I got:

QUERY PLAN
------------------------------------------------------------------
Nested Loop (cost=36.40..276619270.40 rows=13824000000 width=8)
-> Nested Loop (cost=0.00..139234.00 rows=5760000 width=4)
-> Seq Scan on a (cost=0.00..34.00 rows=2400 width=4)
-> Seq Scan on c (cost=0.00..34.00 rows=2400 width=0)
-> Materialize (cost=36.40..60.40 rows=2400 width=4)
-> Seq Scan on b (cost=0.00..34.00 rows=2400 width=4)

Tom made some changes for 8.5 that will result in materialization
being used in more places, and I think we're seeing that here. The
planner thinks that materializing the inner side of the nestloop will
save it from going to disk for every iteration, but that's not really
true. b will be fully cached anyway, but the planner doesn't know
that. I think we need to think about this a little more before we let
this code out into the wild, or we'll get complaints about materialize
nodes being inserted in places where they only slow things down...

Mind you, it's not totally obvious to me what the solution is.

...Robert

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2010-01-12 18:21:48 Re: planner or statistical bug on 8.5
Previous Message Kevin Grittner 2010-01-12 18:05:02 Re: damage control mode