Re: Optimizing DISTINCT with LIMIT

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: tmp <skrald(at)amossen(dot)dk>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Optimizing DISTINCT with LIMIT
Date: 2008-12-04 14:20:17
Message-ID: 4937E721.2050407@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Gregory Stark wrote:
> tmp <skrald(at)amossen(dot)dk> writes:
>
>> If the input is "sufficiently" large and the LIMIT threshold "sufficiently"
>> small, maintain the DISTINCT output by hashning while traversing the input and
>> stop when the LIMIT threshold is reached. No sorting required and *at* *most*
>> one read of input.
>
> You mean like this?
>
> postgres=# explain select distinct x from i limit 5;
> QUERY PLAN
> -------------------------------------------------------------------
> Limit (cost=54.50..54.51 rows=1 width=304)
> -> HashAggregate (cost=54.50..54.51 rows=1 width=304)
> -> Seq Scan on i (cost=0.00..52.00 rows=1000 width=304)
> (3 rows)

Does that know to stop scanning as soon as it has seen 5 distinct values?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Gregory Stark 2008-12-04 14:32:07 Re: Optimizing DISTINCT with LIMIT
Previous Message Gregory Stark 2008-12-04 14:09:34 Re: Optimizing DISTINCT with LIMIT