From: | Brent Dombrowski <brent(dot)dombrowski(at)gmail(dot)com> |
---|---|
To: | Lonni J Friedman <netllama(at)gmail(dot)com> |
Cc: | pgsql-novice(at)postgresql(dot)org |
Subject: | Re: JOIN producing duplicate results |
Date: | 2012-05-03 13:22:49 |
Message-ID: | 4055225C-9115-4088-B3FD-2C5EEA7BCB57@gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-novice |
You are not using a.id in the join. My guess is that all the other columns in a that you are joining against are not unique and that is where the duplicates are coming from.
Brent.
On May 2, 2012, at 4:51 PM, Lonni J Friedman wrote:
> I've got a query that is joining data across 4 tables to provide data
> based on test results. The query is working fine, except for the fact
> that its returning two identical records for each row of unique data.
> If I throw a DISTINCT in front of the primary key column (a.id) of
> one of the tables in the join, that eliminates all the duplicates.
> However, I've read (and found) that DISTINCT tends to introduce a
> performance hit, so I'm hoping to find a better performing solution,
> if possible. Hopefully I'm just doing something silly with my JOINS
> that is easily fixed. This is on postgresql-9.0.x, and yes I'm aware
> that if i upgraded to 9.1.x then I could likely do a 'group by a.id',
> but for now I'm stuck on 9.0.x.
>
> Here's the query:
> SELECT a.id,a.suiteid,a.testname,date_trunc('second',a.last_update) AS
> last_update,regexp_replace(p.relname,E'tests','','g'),o.osname
> FROM smoketests AS a, pg_class AS p, smoke AS t, osversmap AS o
> WHERE a.osversion=o.osversion AND a.suiteid=t.id AND a.tableoid=p.oid
> AND ( a.current_status='FAILED' ) AND ( a.arch='i386' ) AND (
> a.os='Darwin' ) AND a.last_update>'2012-05-01 04:00:00' AND
> a.last_update<'2012-05-02 14:20:45'
> ORDER BY a.id ;
>
> id | suiteid | testname | last_update |
> regexp_replace | osname
> ----------+---------+------------------+---------------------+----------------+------------
> 32549818 | 668232 | bug377064 | 2012-05-01 08:38:07 | smoke
> | OSX-10.7.x
> 32549818 | 668232 | bug377064 | 2012-05-01 08:38:07 | smoke
> | OSX-10.7.x
> 32549819 | 668232 | funcmem_resize | 2012-05-01 08:38:07 | smoke
> | OSX-10.7.x
> 32549819 | 668232 | funcmem_resize | 2012-05-01 08:38:07 | smoke
> | OSX-10.7.x
> 32549820 | 668232 | leitest | 2012-05-01 08:38:07 | smoke | OSX-10.7.x
> 32549820 | 668232 | leitest | 2012-05-01 08:38:07 | smoke | OSX-10.7.x
>
> The problem is visible in the id column, where there are two of each
> value returned even though a.id is the unique primary key of the smoke
> table and doesn't really have duplicates. The 'smoke' table has a one
> to many relationship with the smoketests table, but I'm still rather
> confused why I'm getting the duplicates of everything.
>
> --
> Sent via pgsql-novice mailing list (pgsql-novice(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-novice
From | Date | Subject | |
---|---|---|---|
Next Message | pg noob | 2012-05-03 13:50:28 | how to tell if a tuple has been inserted in current transaction |
Previous Message | Merlin Moncure | 2012-05-03 13:22:41 | Re: REGARDING ADDING A DATATYPE LIKE POLYGON |