Re: cost_hashjoin

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Greg Stark <gsstark(at)mit(dot)edu>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: cost_hashjoin
Date:	2010-08-30 13:49:11
Message-ID:	1283176151.1800.2360.camel@ebony
Lists:	pgsql-hackers

On Mon, 2010-08-30 at 13:34 +0100, Greg Stark wrote:
> On Mon, Aug 30, 2010 at 10:18 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> > cost_hashjoin() has some treatment of what occurs when numbatches > 1
> > but that additional cost is not proportional to numbatches.
>
> Because that's not how our hash batching works. We generate two temp
> files for each batch, one for the outer and one for the inner. So if
> we're batching then every tuple of both the inner and outer tables
> (except for ones in the first batch) needs to be written once and read
> once regardless of the number of batches.

Thanks for explaining. For some reason I thought we rewound the outer at
the start of each batch, which is better for avoiding cache spoiling.
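
To make the difference between the two models concrete, here is a rough sketch in Python. It is not the planner's code: the function names are made up for illustration, it counts tuples rather than pages, and it assumes tuples are spread evenly across batches. The second function models the rewind-the-outer scheme mentioned above, which is hypothetical and is not what the executor does.

```python
def batched_io_tuples(inner_tuples, outer_tuples, numbatches):
    """Temp-file tuple I/O when both inputs are split into per-batch files.

    Assumption: tuples are spread evenly over the batches. Tuples that
    belong to the first batch are joined in memory and never spilled.
    """
    if numbatches <= 1:
        return 0
    spilled_fraction = (numbatches - 1) / numbatches
    spilled = (inner_tuples + outer_tuples) * spilled_fraction
    # Each spilled tuple is written once and read back once.
    return 2 * spilled


def rewind_outer_io_tuples(inner_tuples, outer_tuples, numbatches):
    """Hypothetical alternative: rescan the whole outer input per batch.

    The inner side is still split into per-batch files, but the outer
    side is re-read from the start for every batch after the first.
    """
    if numbatches <= 1:
        return 0
    spilled_fraction = (numbatches - 1) / numbatches
    inner_io = 2 * inner_tuples * spilled_fraction
    extra_outer_reads = outer_tuples * (numbatches - 1)
    return inner_io + extra_outer_reads
```

With 1000 inner and 4000 outer tuples, the first function can never exceed 2 * (1000 + 4000) = 10000 however many batches there are, while the second grows roughly linearly with numbatches. That is why the extra cost for numbatches > 1 does not need to be proportional to numbatches.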

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services

In response to

Re: cost_hashjoin at 2010-08-30 12:34:21 from Greg Stark
