Re: Optimizing DISTINCT with LIMIT

From:	David Lee Lambert <davidl(at)lmert(dot)com>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Optimizing DISTINCT with LIMIT
Date:	2008-12-06 11:29:05
Message-ID:	200812060629.08223.davidl@lmert.com
Lists:	pgsql-hackers

On Thursday 04 December 2008 15:09, Gregory Stark wrote:
> tmp <skrald(at)amossen(dot)dk> writes:

> > Also, it is my impression that many people use LIMIT to minimize the
> > evaluation time of sub queries from which the outer query only needs a
> > small subset of the sub query output.
>
> I've seen lots of queries which only pull a subset of the results too --
> but it's always a specific subset. So that means using ORDER BY or a WHERE
> clause to control it.

I use "ORDER BY random() LIMIT :some_small_number" frequently to get a "feel"
for data. That always builds the full, unrandomized relation and then sorts
it. I guess an alternate path for single-table queries would be to randomly
choose a block number and then a tuple number within that block; but that
would be biased toward long rows (of which fewer can appear in a block).
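
To make the bias concrete, here is a rough sketch in Python (not PostgreSQL
code) of the per-tuple selection probability under "pick a block uniformly,
then pick a tuple uniformly within it". The function name and the two-block
table layout are made up for illustration; real blocks would have many
different fill levels.

```python
from fractions import Fraction


def block_then_tuple_probabilities(tuples_per_block):
    """For each block, return the probability that one particular tuple
    in that block is chosen when a block is picked uniformly at random
    and then a tuple is picked uniformly within that block."""
    n_blocks = len(tuples_per_block)
    return [Fraction(1, n_blocks) * Fraction(1, n) for n in tuples_per_block]


# Hypothetical table: block 0 holds 100 short rows, block 1 holds 10 long rows.
layout = [100, 10]
probs = block_then_tuple_probabilities(layout)

# An unbiased sample would give every one of the 110 rows the same chance.
uniform = Fraction(1, sum(layout))

# How much more likely a given long row is than a given short row.
bias_ratio = probs[1] / probs[0]

print("short row:", probs[0], "long row:", probs[1], "uniform:", uniform)
print("a long row is", bias_ratio, "times as likely as a short row")
```

With this layout each long row is chosen ten times as often as each short
row, whereas "ORDER BY random()" treats all 110 rows alike. Correcting for
that would need some knowledge of how many tuples each block holds.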

--
David Lee Lambert ... Software Developer
Cell phone: +1 586-873-8813 ; alt. email <as4109(at)wayne(dot)edu> or
<lamber45(at)msu(dot)edu>
GPG key at http://www.lmert.com/keyring.txt

In response to

Re: Optimizing DISTINCT with LIMIT at 2008-12-04 20:09:57 from Gregory Stark

Responses

Re: Optimizing DISTINCT with LIMIT at 2008-12-06 18:08:56 from Grzegorz Jaskiewicz
