Re: is (not) distinct from

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
Cc: Johann Spies <johann(dot)spies(at)gmail(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: is (not) distinct from
Date: 2017-03-01 16:12:29
Message-ID: 1318.1488384749@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com> writes:
> Where I am going with this, is that it is not clear to me how you are
> matching the two sets of records to determine whether they are different
> or not.

He's not. The query is forming the cartesian product of the two tables
and then dropping join rows where the tables match ... but every B row is
going to have multiple A rows where it doesn't match, and those join rows
will all survive the WHERE. Then "select distinct" gets rid of the
duplicates, and since nothing from A is presented in the result, it's not
very obvious what's happening.

This is a great example of "select distinct" being used as a band-aid
over a fundamental misunderstanding of SQL. It's good advice to never use
"distinct" unless you know exactly why your query is generating duplicate
rows in the first place.

regards, tom lane

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message jonathan vanasco 2017-03-01 16:39:43 appropriate column for storing ipv4 address
Previous Message Adrian Klaver 2017-03-01 15:18:52 Re: is (not) distinct from