Skip site navigation (1) Skip section navigation (2)

Re: [personal] Re: Filtering duplicated row with a trigger

From: Bruno Wolff III <bruno(at)wolff(dot)to>
To: papapep <papapep(at)gmx(dot)net>
Cc: pgsql-novice(at)postgresql(dot)org
Subject: Re: [personal] Re: Filtering duplicated row with a trigger
Date: 2003-10-06 18:01:41
Message-ID: 20031006180141.GA28877@wolff.to (view raw or flat)
Thread:
Lists: pgsql-novice
Please keep messages copied to the list.

On Mon, Oct 06, 2003 at 19:38:46 +0200,
  papapep <papapep(at)gmx(dot)net> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Bruno Wolff III wrote:
> | And what do want to happen when you run accross a duplicate row?
> | Do you just want to discard tuples with a duplicate primary key?
> 
> Initially I wanted to do so. But after I thought to "register" them in a
> dups table.

OK.

> 
> | If you are discarding duplicates, do you care which of the duplicates
> | is discarded?
> 
> That's why I said that perhaps it should be a good thing to keep them in
> a duplicated-rows table, for reviewing them.
> 
> | If you want to combine data from the duplicates, do you have a precise
> | description of what you want to happen?
> 
> No, I do not need to combine data from the duplicated. The entire row is
> accepted or not.

If you initially don't care which dup gets inserted, then the select DISTINCT
ON suggestion will work for loading the real table.
Something like:
insert into real_table select distinct on (pk1, pk2, pk3, pk4, pk5) * from
  temp_table;

To see the sets of duplicates you can do something like:
select * from temp_table, (select pk1, pk2, pk3, pk4, pk5 from temp_table
  group by pk1, pk2, pk3, pk4, pk5 having count(*) > 1) as dups where
  temp_table.pk1 = dups.pk1 and
  temp_table.pk2 = dups.pk2 and
  temp_table.pk3 = dups.pk3 and
  temp_table.pk4 = dups.pk4 and
  temp_table.pk5 = dups.pk5;

If there are a large number of records being imported, you probably want
to create an multicolumn index on pk1, pk2, pk3, pk4, and pk5 on the
temp table after loading it and before doing the query for duplicates.

In response to

pgsql-novice by date

Next:From: Partha SurDate: 2003-10-07 00:27:41
Subject: Re: Filtering duplicated row with a trigger
Previous:From: Josh BerkusDate: 2003-10-06 17:42:59
Subject: Re: [personal] Re: Filtering duplicated row with a trigger

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group