Skip site navigation (1) Skip section navigation (2)

Re: hashjoin chosen over 1000x faster plan

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: hashjoin chosen over 1000x faster plan
Date: 2007-10-10 17:05:55
Message-ID: 1192035955.4233.306.camel@ebony.site (view raw or flat)
Thread:
Lists: pgsql-performance
On Wed, 2007-10-10 at 09:15 -0500, Kevin Grittner wrote:
> >>> On Wed, Oct 10, 2007 at  1:31 AM, in message
> <1191997904(dot)4233(dot)125(dot)camel(at)ebony(dot)site>, Simon Riggs <simon(at)2ndquadrant(dot)com>
> wrote: 
> > On Tue, 2007-10-09 at 15:09 -0500, Kevin Grittner wrote:
> > 
> >> I have a situation where a query is running much slower than I would
> >> expect.  The ANALYZE showed that it is hashing some information which
> >> is rarely needed.  When I set enable_hashjoin = off for the
> >> connection the query run in 1/1000 the time.
> > 
> > Can you confirm the two queries give identical outputs?
>  
> I checked; the output is identical.
>  
> > It isn't clear
> > to me why the second sort is (never executed) in your second plan, which
> > I would only expect to see for an inner merge join.
>  
> I assume that is because there were no rows to sort.  The
> CaseTypeHistEvent view is only needed if there is a link to an event
> which reopens the charge after it is disposed.  This only happens for
> about 1% of the Charge records.

So CHST.EventType is mostly NULL? So the good news is that the default
plan works best when it does actually find a match. So for 1% of cases
you will have an execution time of about 1s, <1ms for the others if you
fiddle with the planner methods.

The planner thinks every row will find a match, yet the actual number is
only 1%. Hmmm, same section of code as last week.

Basically the planner doesn't ever optimise for the possibility of the
never-executed case because even a single row returned would destroy
that assumption. 

If we had an Option node in there, we could run the first part of the
plan before deciding whether to do an MJ or an HJ. Doing that would
avoid doing 2 sorts and return even quicker in the common case (about
80% time) without being slower in the slowest.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


In response to

Responses

pgsql-performance by date

Next:From: Josh TrutwinDate: 2007-10-10 17:25:31
Subject: Re: SQL Monitoring
Previous:From: Scott MarloweDate: 2007-10-10 15:20:02
Subject: Re: Shared Buffer setting in postgresql.conf

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group