Re: Correcting Hash Join Estimates

From: Mark Lubratt <mark(dot)lubratt(at)indeq(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Correcting Hash Join Estimates
Date: 2005-04-04 06:15:38
Message-ID: 3de3144f9eb6c83400f8d41584af47f2@indeq.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance


On Apr 4, 2005, at 12:54 AM, Tom Lane wrote:

> mark(dot)lubratt(at)indeq(dot)com writes:
>> I'm trying to optimize a query and the EXPLAIN ANALYZE (see link
>> below)
>> shows that some hash join row estimates are wrong by a factor of 2-3,
>> and upwards of 7-8.
>
> I doubt that improving those estimates would lead to markedly better
> results. You need to think about improving the view design instead.
> What context is this view used in --- do you just do "select * from
> view_get_all_user_award2", or are there conditions added to it, or
> perhaps it gets joined with other things?

Yes. I forgot to show how the query is executed...

select * from view_get_all_user_award2 where person_id = 1;

> Do you really need the
> DISTINCT constraint?

Yes.

> Do you really need the ORDER BY?

The customer wants an initial ordering in the displayed data.

> Can you
> simplify the WHERE clause at all?
>

I originally had a bunch of LEFT JOINs. After reading Tow's "SQL
Tuning", I was hoping to steer the planner into a more "optimal" plan
by using a large where clause instead and doing the joins there (I
think they're called implicit joins). I was able to shave a couple of
hundred milliseconds off the execution time by doing this.

> Half a second sounds pretty decent to me for a ten-way join with a
> WHERE
> clause as unstructured as that. If you really need it to execute in
> way
> less time, you're probably going to have to rethink your data
> representation to make the query simpler.
>

Unfortunately, I'm not sure I can restructure the data. I did consider
materialized views. However, they couldn't be lazy and that seemed
like a lot of extra work for the backend for very little improvement.

If this sounds like decent performance to you... I guess I can just
tell the complainers that it's as good as it's going to get (barring a
major hardware upgrade...).

Thanks!
Mark

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message bsimon 2005-04-04 08:02:22 Postgresql vs SQLserver for this application ?
Previous Message Tom Lane 2005-04-04 05:54:17 Re: Correcting Hash Join Estimates