Re: Why is 8.4 and 9.0 so much slower on some queries?

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Uwe Schroeder <uwe(at)oss4u(dot)com>
Cc: pgsql-general(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Why is 8.4 and 9.0 so much slower on some queries?
Date: 2011-04-12 13:34:42
Message-ID: BANLkTikDYnA9hZEqQViQ6Cp9Mpa_pDt1=A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Tue, Apr 12, 2011 at 12:58 AM, Uwe Schroeder <uwe(at)oss4u(dot)com> wrote:
>
>
>> Uwe Schroeder <uwe(at)oss4u(dot)com> writes:
>> > I have a 8.3 database and decided for various reasons to upgrade to 8.4.
>> > I also tried 9.0 - same results. On the exactly same hardware with the
>> > exactly same configuration, some queries perform a factor of 100000
>> > slower on 8.4 and 9.0 than on 8.3
>>
>> Hmm ... the problem here seems to come from the much poorer size
>>
>> estimate from the t/ugt antijoin:
>> >                ->  Hash Left Join  (cost=71.37..2445.93 rows=52 width=9)
>> >                (actual time=0.791..2.017 rows=104 loops=1)
>> >
>> >                      Hash Cond: ((subplan) = ugt.user_id)
>> >                      Filter: (ugt.group_id IS NULL)
>> >
>> >                ->  Hash Anti Join  (cost=62.53..2432.89 rows=1 width=9)
>> >                (actual time=0.639..2.976 rows=105 loops=1)
>> >
>> >                      Hash Cond: ((SubPlan 2) = ugt.user_id)
>>
>> Now, 8.3 has no idea what it's doing here, and it's just luck that it
>> produces an estimate that's only a factor of 2 off from reality.  8.4
>> actually understands that it's dealing with an antijoin, and with that
>> greater knowledge it produces ... a much worse estimate :-(.  And that
>> sends it into the weeds about how to perform the higher-level join.
>>
>> While I don't have your specific example to try, I did some
>> experimenting with queries of this form, and I noticed that 8.4's
>> heuristic in eqjoinsel_semi() was going completely nuts and estimating
>> that all rows in the lefthand side have join partners (thus, no rows out
>> of the antijoin).  This is because it has stats for one side of the
>> comparison operator but not the other side (the one with the
>> sub-select).  But it's taking the totally-made-up ndistinct estimate for
>> the sub-select at face value.  It needs to be a bit warier I think.
>>
>> In general you might want to think about whether you can recast this
>> query to avoid the sub-selects in the join conditions.  The planner
>> isn't terribly bright about sub-selects in the first place, and you're
>> putting them in places where quality estimates are really critical to
>> getting a good plan.
>
>
> Ahhh, that actually makes sense. Since this example isn't the only spot where
> I have this kind of query (I tweaked those for weeks to run fast, but always
> on a 8.3 server), I'll just downgrade to a 8.3 for production.
>
> If you're interested, I can provide database and data for you to toy with as
> there's nothing particularly secretive in that data except email addresses
> which can be erased as they're not needed for anything.

Tom already checked in a fix -- if you are able to build from source
you should be able to test it yourself. 9.0.4 and 8.4.8 will contain
the fix. In the short term, you can get it from git (follow
instructions here): http://wiki.postgresql.org/wiki/Working_with_Git.
Build/install instructions here:
http://www.postgresql.org/docs/9.0/interactive/installation.html.

you will want the branch REL8_4_STABLE or REL9_0_STABLE.

merlin

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Merlin Moncure 2011-04-12 13:57:45 Re: PostgreSQL backend process high memory usage issue
Previous Message Armin Nesiren 2011-04-12 11:14:38 full text search for words with a few specific chars