Quick Links

Re: is (not) distinct from

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
Cc:	Johann Spies <johann(dot)spies(at)gmail(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject:	Re: is (not) distinct from
Date:	2017-03-01 16:12:29
Message-ID:	1318.1488384749@sss.pgh.pa.us
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com> writes:
> Where I am going with this, is that it is not clear to me how you are
> matching the two sets of records to determine whether they are different
> or not.

He's not. The query is forming the cartesian product of the two tables
and then dropping join rows where the tables match ... but every B row is
going to have multiple A rows where it doesn't match, and those join rows
will all survive the WHERE. Then "select distinct" gets rid of the
duplicates, and since nothing from A is presented in the result, it's not
very obvious what's happening.

This is a great example of "select distinct" being used as a band-aid
over a fundamental misunderstanding of SQL. It's good advice to never use
"distinct" unless you know exactly why your query is generating duplicate
rows in the first place.

regards, tom lane

In response to

Re: is (not) distinct from at 2017-03-01 15:18:52 from Adrian Klaver

Responses

DISTINCT vs GROUP BY - was Re: is (not) distinct from at 2017-03-03 05:26:17 from George Neuner
Re: is (not) distinct from at 2017-03-07 09:22:31 from Johann Spies

Browse pgsql-general by date

	From	Date	Subject
Next Message	jonathan vanasco	2017-03-01 16:39:43	appropriate column for storing ipv4 address
Previous Message	Adrian Klaver	2017-03-01 15:18:52	Re: is (not) distinct from