Re: How slow is DISTINCT?

From: "Josh Berkus" <josh(at)agliodbs(dot)com>
To: wweng(at)kencast(dot)com, pgsql-sql(at)postgresql(dot)org
Subject: Re: How slow is DISTINCT?
Date: 2002-02-27 21:45:16
Message-ID: web-812107@davinci.ethosmedia.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Wei Wang,

> How exactly slow is DISTINCT being processed in SQL engines? (not
> limited to postgresql, though comments on postgresql would be most
> relevant)

I can only give you a relative result, based exlusively on my anecdotal
experience with 7.1:

Fast: SELECT ...
Slower: SELECT ... GROUP BY x,y,z
or: SELECT DISCTINCT ON (x) ... (Postgres non-standard extension)
SLowest: SELECT DISTINCT ...

The reason for this is that SELECT DISTINCT is effectively a GROUP BY
on all result fields of the query, and if a few of the aren't indexed
that requires a seq scan.

If performance is an issue, you may wish to consider restructuring your
queries and/or data model to eliminate the actual duplicate rows.

-Josh

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Josh Berkus 2002-02-27 22:54:39 Re: How slow is DISTINCT?
Previous Message Dan Langille 2002-02-27 21:30:58 Re: How slow is DISTINCT?