Skip site navigation (1) Skip section navigation (2)

Re: JOIN producing duplicate results

From: Brent Dombrowski <brent(dot)dombrowski(at)gmail(dot)com>
To: Lonni J Friedman <netllama(at)gmail(dot)com>
Cc: pgsql-novice(at)postgresql(dot)org
Subject: Re: JOIN producing duplicate results
Date: 2012-05-03 13:22:49
Message-ID: 4055225C-9115-4088-B3FD-2C5EEA7BCB57@gmail.com (view raw or flat)
Thread:
Lists: pgsql-novice
You are not using a.id in the join. My guess is that all the other columns in a that you are joining against are not unique and that is where the duplicates are coming from.

Brent.

On May 2, 2012, at 4:51 PM, Lonni J Friedman wrote:

> I've got a query that is joining data across 4 tables to provide data
> based on test results.  The query is working fine, except for the fact
> that its returning two identical records for each row of unique data.
> If I throw a DISTINCT in front of the primary key column (a.id) of
> one of the tables in the join, that eliminates all the duplicates.
> However, I've read (and found) that DISTINCT tends to introduce a
> performance hit, so I'm hoping to find a better performing solution,
> if possible.  Hopefully I'm just doing something silly with my JOINS
> that is easily fixed.  This is on postgresql-9.0.x, and yes I'm aware
> that if i upgraded to 9.1.x then I could likely do a 'group by a.id',
> but for now I'm stuck on 9.0.x.
> 
> Here's the query:
> SELECT a.id,a.suiteid,a.testname,date_trunc('second',a.last_update) AS
> last_update,regexp_replace(p.relname,E'tests','','g'),o.osname
> FROM smoketests AS a, pg_class AS p, smoke AS t, osversmap AS o
> WHERE a.osversion=o.osversion AND a.suiteid=t.id AND a.tableoid=p.oid
> AND ( a.current_status='FAILED' ) AND ( a.arch='i386' ) AND (
> a.os='Darwin' ) AND a.last_update>'2012-05-01 04:00:00' AND
> a.last_update<'2012-05-02 14:20:45'
> ORDER BY a.id ;
> 
>    id    | suiteid |     testname     |     last_update     |
> regexp_replace |   osname
> ----------+---------+------------------+---------------------+----------------+------------
> 32549818 |  668232 | bug377064        | 2012-05-01 08:38:07 | smoke
>   | OSX-10.7.x
> 32549818 |  668232 | bug377064        | 2012-05-01 08:38:07 | smoke
>   | OSX-10.7.x
> 32549819 |  668232 | funcmem_resize   | 2012-05-01 08:38:07 | smoke
>   | OSX-10.7.x
> 32549819 |  668232 | funcmem_resize   | 2012-05-01 08:38:07 | smoke
>   | OSX-10.7.x
> 32549820 |  668232 | leitest | 2012-05-01 08:38:07 | smoke      | OSX-10.7.x
> 32549820 |  668232 | leitest | 2012-05-01 08:38:07 | smoke      | OSX-10.7.x
> 
> The problem is visible in the id column, where there are two of each
> value returned even though a.id is the unique primary key of the smoke
> table and doesn't really have duplicates.  The 'smoke' table has a one
> to many relationship with the smoketests table, but I'm still rather
> confused why I'm getting the duplicates of everything.
> 
> -- 
> Sent via pgsql-novice mailing list (pgsql-novice(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-novice


In response to

Responses

pgsql-novice by date

Next:From: pg noobDate: 2012-05-03 13:50:28
Subject: how to tell if a tuple has been inserted in current transaction
Previous:From: Merlin MoncureDate: 2012-05-03 13:22:41
Subject: Re: REGARDING ADDING A DATATYPE LIKE POLYGON

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group