Re: Why we don't want hints Was: Slow count(*) again...

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: <sthomas(at)peak6(dot)com>
Cc: "Josh Berkus" <josh(at)agliodbs(dot)com>, "Robert Haas" <robertmhaas(at)gmail(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Why we don't want hints Was: Slow count(*) again...
Date: 2011-02-10 17:44:29
Message-ID: 4D53CF9D020000250003A797@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance

Shaun Thomas <sthomas(at)peak6(dot)com> wrote:

> I was thinking more:
>
> JOIN foo_tab USING (foo_id) WITH (COST=50)

The problem I have with that syntax is that it would be hard to read
when you have some nested set of joins or a (SELECT) in the JOIN
instead of simple table name. For me, at least, it would "get lost"
less easily if it were right next to the JOIN keyword.

The problem with a COST factor is that it's not obvious to me what
it would apply to:
- each row on the left?
- each row on the right?
- each row in the result of the JOIN step?
- the entire step?

How would it scale based on other criteria which affected the number
of rows on either side of the join?

If I'm understanding the problem correctly, the part the optimizer
gets wrong (because we don't yet have statistics to support a better
assumption) is assuming that selection criteria on opposite sides of
a join affect entirely independent sets of what would be in the
result without the criteria. To use an oft-cited example, when one
table is selected by zip code and the other by city, that's a bad
assumption about the correlation, leading to bad estimates, leading
to bad costing, leading to bad plans. The OP wanted to override
step 4, a COST setting would try to override step 3, but I think we
would want to override step 1 (until we get statistics which let us
compute that accurately).

-Kevin

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2011-02-10 17:51:31 Re: Does auto-analyze work on dirty writes?
Previous Message Shaun Thomas 2011-02-10 17:30:46 Re: Why we don't want hints Was: Slow count(*) again...

Browse pgsql-performance by date

  From Date Subject
Next Message Robert Haas 2011-02-10 17:51:31 Re: Does auto-analyze work on dirty writes?
Previous Message Shaun Thomas 2011-02-10 17:30:46 Re: Why we don't want hints Was: Slow count(*) again...