Re: duplicates

From: "Moray McConnachie" <moray(dot)mcconnachie(at)computing-services(dot)oxford(dot)ac(dot)uk>
To: "Allan Kelly" <allan(dot)kelly(at)buildstore(dot)co(dot)uk>, <pgsql-sql(at)postgresql(dot)org>
Subject: Re: duplicates
Date: 2000-04-07 12:05:05
Message-ID: 012301bfa089$82ffa920$760e01a3@oucs.ox.ac.uk
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-sql

----- Original Message -----
From: "Allan Kelly" <allan(dot)kelly(at)buildstore(dot)co(dot)uk>
To: <pgsql-sql(at)postgresql(dot)org>
Sent: Friday, April 07, 2000 11:34 AM
Subject: [SQL] duplicates

> I have a system bug which means that I have occasional dupicate entries in
> my 'subscribers' table. I'm trying to find a query which performs
something
> like a 'where user_name is not unique' query.
>
> At the moment we use this cludge:
>
> select count(*), user_name from subscribers
> group by user_name order by count;
>
> (I'm surprised I can't add 'where count > 1' but it seems conditionals on
> aggregate fields are not allowed).
>
> This gives me a very long list with the 'not unique' entries at the
bottom, eg
>
> count | user_name
> ------+------------------
> 1 | bill.hicks
> [ ..cut 9 zillion results.. ]
> 1 | margaret.thatcher
> 4 | linus.torvalds
> 9 | bill.gates
>
> I then have to do
>
> select oid from subscribers where user_name = 'linus.torvalds';
>
> and delete all but one of the records. Is there a better way to do this?

DELETE FROM subscribers WHERE EXISTS(SELECT x.username FROM subscribers x
WHERE subscribers.oid
<x.oid AND subscribers.username LIKE x.username);

ought to do it, I *think*. It will delete all but the most recently inserted
copy, i.e. that with the highest oid. Self-joins (which this is,
effectively), are often quicker and more efficient than counts.

Yours,
Moray

In response to

  • duplicates at 2000-04-07 10:34:01 from Allan Kelly

Browse pgsql-sql by date

  From Date Subject
Next Message Gerhard Dieringer 2000-04-07 12:54:02 Antw: duplicates
Previous Message Leonid P. Klemjatsionok 2000-04-07 12:04:18 Re[2]: update only if single row