Re: Why is 8.4 and 9.0 so much slower on some queries?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Uwe Schroeder <uwe(at)oss4u(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Why is 8.4 and 9.0 so much slower on some queries?
Date: 2011-04-12 04:57:22
Message-ID: 16745.1302584242@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Uwe Schroeder <uwe(at)oss4u(dot)com> writes:
> I have a 8.3 database and decided for various reasons to upgrade to 8.4. I also tried 9.0 - same results.
> On the exactly same hardware with the exactly same configuration, some queries perform a factor of 100000 slower on 8.4 and 9.0 than on 8.3

Hmm ... the problem here seems to come from the much poorer size
estimate from the t/ugt antijoin:

> -> Hash Left Join (cost=71.37..2445.93 rows=52 width=9) (actual time=0.791..2.017 rows=104 loops=1)
> Hash Cond: ((subplan) = ugt.user_id)
> Filter: (ugt.group_id IS NULL)

> -> Hash Anti Join (cost=62.53..2432.89 rows=1 width=9) (actual time=0.639..2.976 rows=105 loops=1)
> Hash Cond: ((SubPlan 2) = ugt.user_id)

Now, 8.3 has no idea what it's doing here, and it's just luck that it
produces an estimate that's only a factor of 2 off from reality. 8.4
actually understands that it's dealing with an antijoin, and with that
greater knowledge it produces ... a much worse estimate :-(. And that
sends it into the weeds about how to perform the higher-level join.

While I don't have your specific example to try, I did some
experimenting with queries of this form, and I noticed that 8.4's
heuristic in eqjoinsel_semi() was going completely nuts and estimating
that all rows in the lefthand side have join partners (thus, no rows out
of the antijoin). This is because it has stats for one side of the
comparison operator but not the other side (the one with the
sub-select). But it's taking the totally-made-up ndistinct estimate for
the sub-select at face value. It needs to be a bit warier I think.

In general you might want to think about whether you can recast this
query to avoid the sub-selects in the join conditions. The planner
isn't terribly bright about sub-selects in the first place, and you're
putting them in places where quality estimates are really critical to
getting a good plan.

regards, tom lane

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Arash pajoohande 2011-04-12 05:32:25 Re: what data type to store fixed size integer?
Previous Message Uwe Schroeder 2011-04-12 03:29:13 Why is 8.4 and 9.0 so much slower on some queries?