Skip site navigation (1) Skip section navigation (2)

Re: Hash or merge join instead of inner loop

From: "Jim C(dot) Nasby" <jim(at)nasby(dot)net>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Hash or merge join instead of inner loop
Date: 2003-06-10 19:42:07
Message-ID: 20030610194206.GK40542@flake.decibel.org (view raw or flat)
Thread:
Lists: pgsql-performance
On Tue, Jun 10, 2003 at 02:15:11AM -0400, Tom Lane wrote:
> "Jim C. Nasby" <jim(at)nasby(dot)net> writes:
> > I have a query that's cauing pgsql choose either a hash or merge join
> > depending on how I mess with the stats variables, but it won't choose an
> > nested loop, even though it's the fastest.
> 
> There's been some discussion about that before; you could check the
> archives (now that they're up again ;-)).  I believe that the planner
> overestimates the cost of a nestloop with inner indexscan, because it
> costs the indexscans as though each one is an independent ab-initio
> index search.  In reality, most of the upper btree levels will no doubt
> stay in memory during such a query, and so this estimate charges many
> more reads than really occur.  Fixing this is on the todo list, but no
> one's got to it yet.  (It's not clear to me how to put the consideration
> into the planner's cost algorithms in a clean way.)
 
What about just ignoring all but the leaf pages? Unless you have a
really, really big index, I think this would probably work well, or at
least better than what we have right now.

I can't think of an elegant way to figure out hit percentages either.
Maybe as a ratio of how often an individual page at a given level of the
btree is to be hit? IE: the root page will always be hit (only one
page); if the next level up has 10 pages, each one is 10% likely to be
in cache, and so-on. Or maybe a better way to look at it is how many
pages sit underneath each page. So if we figure there's a 0.1% chance that
a leaf page is in cache and each page in the layer above/below that has
tuples for 100 leaf pages, then the odds of a page in that layer being
in the cache is 10%

It might also be worth giving index pages a higher priority in the
internal buffer than table pages.
-- 
Jim C. Nasby (aka Decibel!)                    jim(at)nasby(dot)net
Member: Triangle Fraternity, Sports Car Club of America
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

In response to

Responses

pgsql-performance by date

Next:From: Tom LaneDate: 2003-06-10 19:43:47
Subject: Re: FW: [ADMIN] Shared_buffers and kernel parameters, tuning
Previous:From: Bruce MomjianDate: 2003-06-10 19:08:22
Subject: Re: FW: [ADMIN] Shared_buffers and kernel parameters, tuning

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group