Re: Removing duplicate entries

From: Michael Fuhr <mike(at)fuhr(dot)org>
To: Scott Ford <Scott(dot)Ford(at)bullfrogpower(dot)com>
Cc: pgsql-novice(at)postgresql(dot)org
Subject: Re: Removing duplicate entries
Date: 2006-01-11 20:18:04
Message-ID: 20060111201804.GA87587@winnie.fuhr.org
Lists: pgsql-novice

On Wed, Jan 11, 2006 at 02:06:53PM -0500, Scott Ford wrote:
> customers
>     customer_id
>     ...
>
> documents
>     customer_id
>     document_id
>     document_type_id
>     ...
>
> So, for example, there are two documents with the same document_type_id
> associated with one customer.
>
> Can someone help me with a SQL statement that might help me remove the
> duplicate documents for a certain document_type_id?

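Is document_id a primary key (or otherwise unique)? If so, then
something like this might work. It keeps the row with the lowest
document_id for each (customer_id, document_type_id) pair and
deletes the rest: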

DELETE FROM documents WHERE document_id NOT IN (
    SELECT min(document_id)
    FROM documents
    GROUP BY customer_id, document_type_id
);
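
Swapping the DELETE for a SELECT previews the rows that statement
would remove without changing anything. This is only a sketch; it
uses just the column names from your message and assumes document_id
is unique and never NULL:

```sql
-- Lists the rows the DELETE would remove; nothing is modified.
SELECT customer_id, document_type_id, document_id
FROM documents
WHERE document_id NOT IN (
    SELECT min(document_id)
    FROM documents
    GROUP BY customer_id, document_type_id
)
ORDER BY customer_id, document_type_id, document_id;
```

If that list contains rows you want to keep, the grouping columns
probably don't match what you consider a duplicate.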

Be sure you understand what the DELETE does before running it; I
might be making assumptions about your data that aren't correct.
As written it removes duplicates for every document_type_id, not
just one; add a condition on document_type_id to the outer WHERE
clause if you only want to clean up a single type. I'd advise
trying this or any other suggestion against test data before using
it on data you don't want to lose. I'd also recommend using a
transaction that you can roll back if necessary: start a transaction,
run the delete, run some queries to make sure the changes are
correct, then either commit or roll back.
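
The transaction workflow might look roughly like this in psql. Treat
it as a sketch under the same assumptions as the DELETE (document_id
unique and not NULL); the value 3 in the comment is a made-up
document_type_id, not something from your data:

```sql
BEGIN;

-- Row count before the delete, for comparison afterwards.
SELECT count(*) FROM documents;

DELETE FROM documents WHERE document_id NOT IN (
    SELECT min(document_id)
    FROM documents
    GROUP BY customer_id, document_type_id
);
-- To touch only one type, add a condition to the outer WHERE, e.g.
--   WHERE document_type_id = 3 AND document_id NOT IN ( ... )

-- Returns no rows once every (customer_id, document_type_id)
-- pair is down to a single document.
SELECT customer_id, document_type_id, count(*) AS copies
FROM documents
GROUP BY customer_id, document_type_id
HAVING count(*) > 1;

-- Undo everything if the results look wrong ...
ROLLBACK;
-- ... or run COMMIT; instead of ROLLBACK; to keep the changes.
```

Until you issue COMMIT, other sessions still see the original rows,
so you can take your time checking the results.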

--
Michael Fuhr
