Re: TB-sized databases

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Matthew <matthew(at)flymine(dot)org>
Cc:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: TB-sized databases
Date:	2007-12-06 17:55:38
Message-ID:	17456.1196963738@sss.pgh.pa.us
Lists:	pgsql-performance

Matthew <matthew(at)flymine(dot)org> writes:
> ... For this query, Postgres would perform a nested loop,
> iterating over all rows in the small table, and doing a hundred index
> lookups in the big table. This completed very quickly. However, adding the
> LIMIT meant that suddenly a merge join was very attractive to the planner,
> as it estimated the first row to be returned within milliseconds, without
> needing to sort either table.

> The problem is that Postgres didn't know that the first hit in the big
> table would be about half-way through, after doing an index sequential scan
> for half a bazillion rows.

Hmm. IIRC, there are smarts in there about whether a mergejoin can
terminate early because of disparate ranges of the two join variables.
Seems like it should be straightforward to fix it to also consider
whether the time-to-return-first-row will be bloated because of
disparate ranges. I'll take a look --- but it's probably too late
to consider this for 8.3.

regards, tom lane
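
The startup-cost effect discussed above can be illustrated with a toy model. This is only a sketch, not PostgreSQL's planner or executor code: the function names, the use of sorted Python lists to stand in for index scans, and the key ranges are all assumptions made for illustration. It counts how many rows a merge join must step past before emitting its first row when the small table's keys begin half-way through the big table's range, and compares that with the number of index probes a nested loop needs to satisfy a LIMIT.

```python
from bisect import bisect_left


def merge_join_rows_before_first_match(small_keys, big_keys):
    """Count rows stepped past in both sorted inputs before the first match.

    Both inputs must be sorted ascending, as an index scan would deliver them.
    Returns None if the inputs share no key.
    """
    i = j = 0
    while i < len(small_keys) and j < len(big_keys):
        if small_keys[i] == big_keys[j]:
            return i + j  # rows discarded before the first joined row
        if small_keys[i] < big_keys[j]:
            i += 1
        else:
            j += 1
    return None


def nested_loop_probes_for_limit(small_keys, big_keys, limit):
    """Count index probes into big_keys needed to produce `limit` joined rows.

    bisect_left on a sorted list stands in for a btree index lookup.
    """
    probes = 0
    produced = 0
    for key in small_keys:
        probes += 1
        pos = bisect_left(big_keys, key)
        while pos < len(big_keys) and big_keys[pos] == key:
            produced += 1
            if produced >= limit:
                return probes
            pos += 1
    return probes


if __name__ == "__main__":
    # Hypothetical data: the big table covers 0..999999, while the small
    # table's hundred keys start half-way through that range.
    big_keys = list(range(1_000_000))
    small_keys = list(range(500_000, 500_100))

    print(merge_join_rows_before_first_match(small_keys, big_keys))  # 500000
    print(nested_loop_probes_for_limit(small_keys, big_keys, 1))     # 1
```

In this model the merge join reads half the big table before returning anything, while the nested loop returns its first row after a single probe. A first-row cost estimate that ignores the disparate key ranges would therefore favour the wrong plan under LIMIT.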

In response to

Re: TB-sized databases at 2007-12-06 17:46:35 from Matthew

Responses

Re: TB-sized databases at 2007-12-06 18:03:09 from Matthew
Re: TB-sized databases at 2007-12-07 01:55:02 from Tom Lane
