Re: Optimizing DISTINCT with LIMIT

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Gregory Stark <stark(at)enterprisedb(dot)com>, tmp <skrald(at)amossen(dot)dk>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Optimizing DISTINCT with LIMIT
Date: 2008-12-04 14:35:26
Message-ID: 24128.1228401326@sss.pgh.pa.us
Lists: pgsql-hackers

Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
> Gregory Stark wrote:
>> You mean like this?
>>
>> postgres=# explain select distinct x from i limit 5;
>>                             QUERY PLAN
>> -------------------------------------------------------------------
>>  Limit  (cost=54.50..54.51 rows=1 width=304)
>>    ->  HashAggregate  (cost=54.50..54.51 rows=1 width=304)
>>          ->  Seq Scan on i  (cost=0.00..52.00 rows=1000 width=304)
>> (3 rows)

> Does that know to stop scanning as soon as it has seen 5 distinct values?

In principle, if there are no aggregate functions, then nodeAgg could
return a row immediately upon making any new entry into the hash table.
Whether it's worth the code uglification is debatable ... I think it
would require a third major pathway through nodeAgg.
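As a rough standalone sketch of that idea (not nodeAgg itself; all names below
are hypothetical), a hash-based DISTINCT can hand back each value the first
time it goes into the hash table, so an enclosing LIMIT can stop the scan
early instead of waiting for the whole input to be hashed:

    /*
     * Standalone illustration (not PostgreSQL code) of returning a row
     * immediately upon making a new entry in the hash table, so an
     * enclosing LIMIT can stop the scan early.
     */
    #include <stdbool.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define NBUCKETS 1024

    typedef struct HashEntry
    {
        int               value;
        struct HashEntry *next;
    } HashEntry;

    static HashEntry *buckets[NBUCKETS];

    /* Insert value; return true only if it was not already present. */
    static bool
    hash_insert_new(int value)
    {
        unsigned int bucket = (unsigned int) value % NBUCKETS;
        HashEntry  *e;

        for (e = buckets[bucket]; e != NULL; e = e->next)
            if (e->value == value)
                return false;           /* duplicate: no new entry */

        e = malloc(sizeof(HashEntry));
        e->value = value;
        e->next = buckets[bucket];
        buckets[bucket] = e;
        return true;                    /* new entry made */
    }

    int
    main(void)
    {
        /* Stand-in for the Seq Scan's input stream. */
        int     input[] = {7, 3, 7, 9, 3, 3, 1, 9, 5, 2, 8, 5};
        int     ninput = sizeof(input) / sizeof(input[0]);
        int     limit = 5;              /* the LIMIT above the aggregate */
        int     emitted = 0;

        for (int i = 0; i < ninput && emitted < limit; i++)
        {
            /*
             * A row is emitted the moment a new hash entry is created, so
             * once LIMIT distinct values have been produced the loop exits
             * even though the input has not been exhausted.
             */
            if (hash_insert_new(input[i]))
            {
                printf("%d\n", input[i]);
                emitted++;
            }
        }
        return 0;
    }

With the sample input above, the scan stops after the fifth new value even
though rows remain unread, which is the early-exit behaviour the LIMIT is
after. The current nodeAgg instead fills the whole hash table before
returning anything, hence the "third major pathway" it would take to do this
for the no-aggregate case.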

regards, tom lane
