The patch originally modified the cost function, but I removed that
part before we submitted it, to be a bit conservative about our
proposed changes. I didn't like that for large plans the statistics
were retrieved and calculated many times when finding the optimal
The overhead of the algorithm when the skew optimization is not used
ends up being roughly a function call and an if statement per tuple.
It would be easy to remove the function call per tuple. Dr. Lawrence
has come up with some changes so that when the optimization is turned
off, the function call does not happen at all and instead of the if
statement happening per tuple it is run just once per join. We have
to test this a bit more but it should further reduce the overhead.
Hopefully we will have the new patch ready to go this weekend.
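
As a rough sketch of the idea (the struct, the function names, and the
"key == 42" skew test are all made up for illustration and are not the
patch's actual code), the difference between checking per tuple and
checking once per join looks something like this:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the join state; not the patch's real struct. */
typedef struct JoinState
{
    bool   skew_enabled;    /* is the skew optimization in use for this join? */
    size_t skew_hits;       /* tuples diverted to the skew path */
    size_t normal_inserts;  /* tuples sent down the regular path */
} JoinState;

/* Assumed skew test: pretend a single key value is the "most common" one. */
bool
is_skew_value(int key)
{
    return key == 42;
}

/* Called for every tuple: costs a function call and an if even when disabled. */
bool
try_skew_insert(JoinState *js, int key)
{
    if (!js->skew_enabled)
        return false;
    if (is_skew_value(key))
    {
        js->skew_hits++;
        return true;
    }
    return false;
}

/* Variant 1: the enabled check is paid once per tuple. */
void
build_per_tuple_check(JoinState *js, const int *keys, size_t nkeys)
{
    assert(keys != NULL || nkeys == 0);
    for (size_t i = 0; i < nkeys; i++)
    {
        if (!try_skew_insert(js, keys[i]))
            js->normal_inserts++;
    }
}

/* Variant 2: the enabled check is hoisted out and paid once per join. */
void
build_hoisted_check(JoinState *js, const int *keys, size_t nkeys)
{
    assert(keys != NULL || nkeys == 0);
    if (js->skew_enabled)
    {
        for (size_t i = 0; i < nkeys; i++)
        {
            if (is_skew_value(keys[i]))
                js->skew_hits++;
            else
                js->normal_inserts++;
        }
    }
    else
    {
        /* Optimization off: no skew function call and no per-tuple branch. */
        for (size_t i = 0; i < nkeys; i++)
            js->normal_inserts++;
    }
}
```

Both variants produce the same counts for the same input; the second
just keeps the disabled case from paying anything per tuple.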
- Bryce Cutt
On Thu, Feb 26, 2009 at 7:45 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Heikki's got a point here: the planner is aware that hashjoin doesn't
> like skewed distributions, and it assigns extra cost accordingly if it
> can determine that the join key is skewed. (See the "bucketsize" stuff
> in cost_hashjoin.) If this patch is accepted we'll want to tweak that
> Still, that has little to do with the current gating issue, which is
> whether we've convinced ourselves that the patch doesn't cause a
> performance decrease for cases in which it's unable to help.
> regards, tom lane