From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com> |
Cc: | Johann Spies <johann(dot)spies(at)gmail(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: is (not) distinct from |
Date: | 2017-03-01 16:12:29 |
Message-ID: | 1318.1488384749@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com> writes:
> Where I am going with this, is that it is not clear to me how you are
> matching the two sets of records to determine whether they are different
> or not.
He's not. The query is forming the cartesian product of the two tables
and then dropping join rows where the tables match ... but every B row is
going to have multiple A rows where it doesn't match, and those join rows
will all survive the WHERE. Then "select distinct" gets rid of the
duplicates, and since nothing from A is presented in the result, it's not
very obvious what's happening.
This is a great example of "select distinct" being used as a band-aid
over a fundamental misunderstanding of SQL. It's good advice to never use
"distinct" unless you know exactly why your query is generating duplicate
rows in the first place.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | jonathan vanasco | 2017-03-01 16:39:43 | appropriate column for storing ipv4 address |
Previous Message | Adrian Klaver | 2017-03-01 15:18:52 | Re: is (not) distinct from |