DISTINCT vs GROUP BY - was Re: is (not) distinct from

From: George Neuner <gneuner2(at)comcast(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: DISTINCT vs GROUP BY - was Re: is (not) distinct from
Date: 2017-03-03 05:26:17
Message-ID: g7rhbcpnpn4jqokptb3tvc9vl7j13q6c1k@4ax.com
Lists: pgsql-general

On Wed, 01 Mar 2017 11:12:29 -0500, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
wrote:

>This is a great example of "select distinct" being used as a band-aid
>over a fundamental misunderstanding of SQL. It's good advice to never use
>"distinct" unless you know exactly why your query is generating duplicate
>rows in the first place.

On that note:

I know most people here don't pay much - or any - attention to
SQL Server, but there was an interesting article recently about
significant performance differences between DISTINCT and GROUP BY
when used to remove duplicates.

https://sqlperformance.com/2017/01/t-sql-queries/surprises-assumptions-group-by-distinct

Now I'm wondering if something similar might be lurking in PostgreSQL?

[Yeah, I know - test it and find out!

Thing is, the queries used in the article are not simple. Although
it doesn't say so explicitly, the article hints that - at least for
SQL Server - a simple case involving a string column is probably
insufficient, and that complex scenarios are required to produce
significant differences.
]
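
For reference, the two de-duplication forms being compared can be sketched as below. This is only a minimal illustration, assuming Python's built-in sqlite3 module as a stand-in engine (not PostgreSQL or SQL Server); the `orders` table, its columns, and the `dedupe_both_ways` helper are hypothetical names made up for the example. It shows only that the two forms return the same rows and says nothing about their relative performance.

```python
import sqlite3


def dedupe_both_ways(rows):
    """Return (distinct_rows, grouped_rows) for the same input data.

    Hypothetical helper: loads `rows` into an in-memory table and
    removes duplicates once with DISTINCT and once with GROUP BY.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer TEXT, item TEXT)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)

    # Form 1: DISTINCT over the selected columns.
    distinct_rows = conn.execute(
        "SELECT DISTINCT customer, item FROM orders "
        "ORDER BY customer, item"
    ).fetchall()

    # Form 2: GROUP BY the same columns, with no aggregates.
    grouped_rows = conn.execute(
        "SELECT customer, item FROM orders "
        "GROUP BY customer, item "
        "ORDER BY customer, item"
    ).fetchall()

    conn.close()
    return distinct_rows, grouped_rows


# Made-up sample data containing one duplicate row.
sample = [("ann", "pen"), ("ann", "pen"), ("bob", "ink"), ("ann", "ink")]
distinct_rows, grouped_rows = dedupe_both_ways(sample)
print(distinct_rows == grouped_rows)
```

On PostgreSQL itself, the plans chosen for the two forms could be compared by prefixing each query with EXPLAIN ANALYZE.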

I'll get around to doing some testing soon. For now, I am just asking
if anyone has ever run into something like this?

George
