Quick Links

Re: Optimizing DISTINCT with LIMIT

From:	Gregory Stark <stark(at)enterprisedb(dot)com>
To:	tmp <skrald(at)amossen(dot)dk>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Optimizing DISTINCT with LIMIT
Date:	2008-12-04 14:09:34
Message-ID:	878wqw117l.fsf@oxford.xeocode.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

tmp <skrald(at)amossen(dot)dk> writes:

> If the input is "sufficiently" large and the LIMIT threshold "sufficiently"
> small, maintain the DISTINCT output by hashning while traversing the input and
> stop when the LIMIT threshold is reached. No sorting required and *at* *most*
> one read of input.

You mean like this?

postgres=# explain select distinct x from i limit 5;
QUERY PLAN
-------------------------------------------------------------------
Limit (cost=54.50..54.51 rows=1 width=304)
-> HashAggregate (cost=54.50..54.51 rows=1 width=304)
-> Seq Scan on i (cost=0.00..52.00 rows=1000 width=304)
(3 rows)

This will be in the upcoming 8.4 release.

Versions since about 7.4 or so have been capable of producing this plan but
not for DISTINCT, only for the equivalent GROUP BY query:

postgres=# explain select x from i group by x limit 5;

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's 24x7 Postgres support!

In response to

Optimizing DISTINCT with LIMIT at 2008-12-04 13:32:51 from tmp

Responses

Re: Optimizing DISTINCT with LIMIT at 2008-12-04 14:20:17 from Heikki Linnakangas

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Heikki Linnakangas	2008-12-04 14:20:17	Re: Optimizing DISTINCT with LIMIT
Previous Message	Magnus Hagander	2008-12-04 14:08:13	Re: Re: pgsql: Properly unregister OpenSSL callbacks when libpq is done with