Quick Links

SQL for removing duplicates?

From:	<kynn(at)panix(dot)com>
To:	pgsql-novice(at)postgresql(dot)org
Subject:	SQL for removing duplicates?
Date:	2006-06-13 16:11:41
Message-ID:	200606131611.k5DGBff21012@panix3.panix.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-novice

Hi. I'm stumped. I have a large table (about 8.5M records), let's
call it t, whose columns include x and y. I want to remove records
from this table so that any pair of values for these two fields appear
only once. (This will get rid of about 15% of the records in t.)

One simple solution would be something like

CREATE TABLE tmp AS SELECT DISTINCT ON ( x, y ) * FROM t;
DROP TABLE t;
ALTER TABLE tmp RENAME TO t;

This works, but it uses a lot of space. I would prefer to simply cull
the unwanted records from t, but I just can't figure out the SQL for
it. Any help with it would be *much* appreciated.

Thanks!

Responses

Re: SQL for removing duplicates? at 2006-06-13 18:26:04 from Brad Nicholson

Browse pgsql-novice by date

	From	Date	Subject
Next Message	Brad Nicholson	2006-06-13 18:26:04	Re: SQL for removing duplicates?
Previous Message	Christoph Frick	2006-06-13 15:29:13	order by desc - with zeros on top