Re: hashjoin chosen over 1000x faster plan

From: Simon Riggs <simon@2ndquadrant.com>
To: Kevin Grittner <Kevin.Grittner@wicourts.gov>
Cc: pgsql-performance@postgresql.org, Tom Lane <tgl@sss.pgh.pa.us>
Subject: Re: hashjoin chosen over 1000x faster plan
Date: 2007-10-10 19:52:25
Message-ID: 1192045945.4233.351.camel@ebony.site
Lists: pgsql-performance

On Wed, 2007-10-10 at 14:35 -0500, Kevin Grittner wrote:
> >>> On Wed, Oct 10, 2007 at 1:54 PM, in message
> <1192042492.4233.334.camel@ebony.site>, Simon Riggs <simon@2ndquadrant.com>
> wrote:
> >
> > But the planner doesn't work on probability. It works on a best-guess
> > selectivity, as known at planning time.
>
> The point I'm trying to make is that at planning time the
> pg_statistic row for this "Charge"."reopHistSeqNo" column showed
> stanullfrac as 0.989; the planner doesn't seem to have taken this
> into account when estimating how many rows would join against the
> primary key column of the "CaseHist" table. I'm suggesting that it
> might be a good thing if it did.

Understood, it would be a good thing if it did.
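
For reference, the figure being cited can be read straight out of the
pg_stats view; a minimal sketch, with the table and column names taken
from the thread:

    -- null_frac is the view's presentation of pg_statistic.stanullfrac,
    -- so this should report roughly 0.989 for the column in question.
    SELECT tablename, attname, null_frac
    FROM pg_stats
    WHERE tablename = 'Charge'
      AND attname = 'reopHistSeqNo';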

It's more complex than you think:

The fast plan is an all-or-nothing plan. It is *only* faster when the
number of matched rows is zero. You know it is zero, but currently the
planner doesn't, nor is it able to make use of that information once it
has it, halfway through execution. Even if we could work out the high
probability of it being zero, we would still be left with the decision
of whether to optimise for the zero case or for the non-zero case.
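
To make that concrete, here is a hypothetical sketch of the shape of the
join under discussion; "caseHistSeqNo" is an assumed name for the
"CaseHist" primary key column, which the thread does not spell out:

    -- NULL never matches in an equijoin, so with stanullfrac ~ 0.989 on
    -- "reopHistSeqNo" almost every outer row is eliminated immediately.
    -- A nestloop probing the primary-key index does near-zero work when
    -- nothing matches, while a hash join pays to build its hash table
    -- up front regardless of how many rows ultimately match.
    SELECT c.*
    FROM "Charge" c
    JOIN "CaseHist" h ON h."caseHistSeqNo" = c."reopHistSeqNo";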

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com
