Quick Links

Re: TB-sized databases

From:	Matthew <matthew(at)flymine(dot)org>
To:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: TB-sized databases
Date:	2007-12-06 18:03:09
Message-ID:	Pine.LNX.4.58.0712061758360.3731@aragorn.flymine.org
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

On Thu, 6 Dec 2007, Tom Lane wrote:
> Matthew <matthew(at)flymine(dot)org> writes:
> > ... For this query, Postgres would perform a nested loop,
> > iterating over all rows in the small table, and doing a hundred index
> > lookups in the big table. This completed very quickly. However, adding the
> > LIMIT meant that suddenly a merge join was very attractive to the planner,
> > as it estimated the first row to be returned within milliseconds, without
> > needing to sort either table.
>
> > The problem is that Postgres didn't know that the first hit in the big
> > table would be about half-way through, after doing a index sequential scan
> > for half a bazillion rows.
>
> Hmm. IIRC, there are smarts in there about whether a mergejoin can
> terminate early because of disparate ranges of the two join variables.
> Seems like it should be straightforward to fix it to also consider
> whether the time-to-return-first-row will be bloated because of
> disparate ranges. I'll take a look --- but it's probably too late
> to consider this for 8.3.

Very cool. Would that be a planner cost estimate fix (so it avoids the
merge join), or a query execution fix (so it does the merge join on the
table subset)?

Matthew

--
I've run DOOM more in the last few days than I have the last few
months. I just love debugging ;-) -- Linus Torvalds

In response to

Re: TB-sized databases at 2007-12-06 17:55:38 from Tom Lane

Responses

Re: TB-sized databases at 2007-12-06 18:34:47 from Tom Lane

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Tom Lane	2007-12-06 18:34:47	Re: TB-sized databases
Previous Message	Tom Lane	2007-12-06 17:55:38	Re: TB-sized databases