Re: performance drop on 8.2.4, reverting to 8.1.4

From: "Steven Flatt" <steven(dot)flatt(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Steinar H(dot) Gunderson" <sgunderson(at)bigfoot(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: performance drop on 8.2.4, reverting to 8.1.4
Date: 2007-06-05 21:30:14
Message-ID: 357fa7590706051430h5db6760dx9d4a46c7a8a36205@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 5/18/07, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
>
> Yeah. 8.2 is estimating that the "nodeid IS NULL" condition will
> discard all or nearly all the rows, presumably because there aren't any
> null nodeid's in the underlying table --- it fails to consider that the
> LEFT JOIN may inject some nulls. 8.1 was not any brighter; the reason
> it gets a different estimate is that it doesn't distinguish left-join
> and WHERE clauses at all, but assumes that the result of the left join
> can't have fewer rows than its left input, even after applying the
> filter condition. In this particular scenario that happens to be a
> better estimate. So even though 8.2 is smarter, and there is no bug
> here that wasn't in 8.1 too, it's getting a worse estimate leading to
> a worse plan.
>
> This is a sufficiently common idiom that I think it's a must-fix
> problem. Not sure about details yet, but it seems somehow the
> selectivity estimator had better start accounting for
> outer-join-injected NULLs.
>
This problem is causing us a bit of grief as we plan to move from 8.1.4 to
8.2.4. We have many (on the order of a hundred) queries that are of the
form:

(A) LEFT JOIN (B) ON col WHERE B.col IS NULL

These queries are much slower on 8.2 than on 8.1 for what looks like the
reason outlined above. I have rewritten a few key queries to be of the
equivalent form:

(A) WHERE col NOT IN (SELECT col FROM (B))

which has resulted in a dramatic improvement. I'm really hoping that I'm
not going to need to re-write every single one of our queries that are of
the first form above. Is there any estimation as to if/when the fix will
become available? I'm hoping this isn't going to be a showstopper in us
moving to 8.2.

Thanks,
Steve

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Steinar H. Gunderson 2007-06-05 21:38:47 Re: performance drop on 8.2.4, reverting to 8.1.4
Previous Message david 2007-06-05 20:33:23 Re: Thousands of tables versus on table?