SQL for removing duplicates?

From: <kynn(at)panix(dot)com>
To: pgsql-novice(at)postgresql(dot)org
Subject: SQL for removing duplicates?
Date: 2006-06-13 16:11:41
Message-ID: 200606131611.k5DGBff21012@panix3.panix.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-novice

Hi. I'm stumped. I have a large table (about 8.5M records), let's
call it t, whose columns include x and y. I want to remove records
from this table so that any pair of values for these two fields appear
only once. (This will get rid of about 15% of the records in t.)

One simple solution would be something like

CREATE TABLE tmp AS SELECT DISTINCT ON ( x, y ) * FROM t;
DROP TABLE t;
ALTER TABLE tmp RENAME TO t;

This works, but it uses a lot of space. I would prefer to simply cull
the unwanted records from t, but I just can't figure out the SQL for
it. Any help with it would be *much* appreciated.

Thanks!

kj

Responses

Browse pgsql-novice by date

  From Date Subject
Next Message Brad Nicholson 2006-06-13 18:26:04 Re: SQL for removing duplicates?
Previous Message Christoph Frick 2006-06-13 15:29:13 order by desc - with zeros on top