Re: Very slow left outer join

From: "Tyrrill, Ed" <tyrrill_ed(at)emc(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Klint Gore" <kg(at)kgb(dot)une(dot)edu(dot)au>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Very slow left outer join
Date: 2007-05-30 16:55:10
Message-ID: A23190A408F7094FAF446C1538222F7603F98161@avaexch01.avamar.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
> Klint Gore <kg(at)kgb(dot)une(dot)edu(dot)au> writes:
>> On Tue, 29 May 2007 17:16:57 -0700, "Tyrrill, Ed"
<tyrrill_ed(at)emc(dot)com> wrote:
>>> mdsdb=# explain analyze select backupobjects.record_id from
>>> backupobjects left outer join backup_location using(record_id) where
>>> backup_id = 1071;
>
>> Why are you using left join?
>> The where condition is going to force the row to exist.

This select is a simplified version of what I am really doing that still
exhibits the problem I am having. I know this small query doesn't
really make sense, but I thought it would be easier to evaluate
something small rather then the entire query.

>
> Which indeed the planner figured out (note the lack of any mention of
> left join in the EXPLAIN result). Michael put his finger on the
problem
> though: there's something way off about the rowcount estimate here:
>
>> -> Bitmap Heap Scan on backup_location (cost=3831.20..360207.21
>> rows=436915 width=8) (actual time=94.375..97.688 rows=2789 loops=1)
>> Recheck Cond: (backup_id = 1071)
>> -> Bitmap Index Scan on backup_location_bid
>> (cost=0.00..3831.20 rows=436915 width=0) (actual time=84.239..84.239
>> rows=2789 loops=1)
>> Index Cond: (backup_id = 1071)
>
> With such a simple index condition the planner really ought to be able
> to come close to the right rowcount estimate. Check for vacuuming
> problems, check for lack of ANALYZE, consider whether you need to bump
> up the statistics target ...
>
> regards, tom lane

I did a vacuum analyze after inserting all the data. Is there possibly
a bug in analyze in 8.1.5-6? I know it says rows=436915, but the last
time the backup_location table has had that little data in it was a
couple months ago, and analyze has been run many times since then.
Currently it has over 160 million rows.

Thanks,
Ed

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2007-05-30 16:59:11 Re: Very slow left outer join
Previous Message Tyrrill, Ed 2007-05-30 16:22:46 Re: Very slow left outer join