Re: selecting duplicate records

From: Christopher Browne <cbbrowne(at)acm(dot)org>
To: pgsql-sql(at)postgresql(dot)org
Subject: Re: selecting duplicate records
Date: 2003-09-23 03:13:23
Message-ID: m3r8282o4s.fsf@wolfe.cbbrowne.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

The world rejoiced as ch(at)rodos(dot)fzk(dot)de (Christoph Haller) wrote:
>> 1. How to select duplicate records only from a single table using a
> select
>> query.
>>
> e.g.
> select sid,count(sid) from location group by sid having count(sid)>1;
>
> Do you get the idea?
> Your request is pretty unspecific, so if this is not what you're asking
> for,
> try again.

The aggregate is likely to perform horrifically badly. Here might
be an option:

Step 1. Find all of the duplicates...

select a.* into temp table sid from some_table a, some_table b
where a.oid < b.oid and
a.field1 = b.field1 and
a.field2 = b.field2 and
a.field3 = b.field3 and
...
a.fieldn = b.fieldn;

Step 2. Look for the matching entries in the source table...

select a.* from some_table a, sid b
where
a.field1 = b.field1 and
a.field2 = b.field2 and
a.field3 = b.field3 and
...
a.fieldn = b.fieldn;

[There's a weakness here; if there are multiple dupes, they may get
picked multiple times in the second query :-(.]
--
If this was helpful, <http://svcs.affero.net/rm.php?r=cbbrowne> rate me
http://cbbrowne.com/info/rdbms.html
As Will Rogers would have said, "There is no such thing as a free
variable." -- Alan Perlis

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Joe Conway 2003-09-23 03:59:27 Re: Oracle 'connect by prior' now eaiser in 7.3?
Previous Message Rudi Starcevic 2003-09-23 02:54:53 unsubscribe