Re: A DISTINCT problem removing duplicates

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Richard Huxton <dev(at)archonet(dot)com>
Cc:	PostgreSQL SQL List <pgsql-sql(at)postgresql(dot)org>
Subject:	Re: A DISTINCT problem removing duplicates
Date:	2008-12-09 15:39:29
Message-ID:	15516.1228837169@sss.pgh.pa.us
Lists:	pgsql-sql

Richard Huxton <dev(at)archonet(dot)com> writes:
> Tom Lane wrote:
>> Richard Huxton <dev(at)archonet(dot)com> writes:
>>> Anyone got anything more elegant?
>>
>> Seems to me that no document should have an empty dup_set. If it's not
>> a match to any existing document, then immediately assign a new dup_set
>> number to it.

> That was my initial thought too, but it means when I actually find a
> duplicate I have to decide which "direction" to renumber them in.

Hmm, so you mean you might decide that two docs are duplicates sometime
after initially putting them both in the database? Seems like you have
issues with that anyway. If you already know A,B are dups and
separately that C,D are dups, and you later decide B and C are dups,
what do you do?

regards, tom lane
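
One possible answer to the A,B / C,D question above, offered as a minimal sketch rather than anything proposed in the thread: give every document its own dup_set as soon as it is added, and when two documents are later found to be duplicates, merge their whole sets, always renumbering towards the lower set number so the "direction" is fixed. This assumes duplication is transitive (if B and C are dups, then A, B, C and D all are), which is exactly the point the question raises. The class name DupSets and its methods are hypothetical, and a plain dictionary stands in for the database table.

```python
class DupSets:
    """Hypothetical in-memory stand-in for a documents table with a dup_set column."""

    def __init__(self):
        self.dup_set = {}   # document name -> dup_set number
        self.next_id = 1    # plays the role of a sequence

    def add(self, doc):
        # No document ever has an empty dup_set: a new one starts in its own set.
        self.dup_set[doc] = self.next_id
        self.next_id += 1

    def mark_duplicates(self, a, b):
        # Merge the two sets, keeping the lower number so the
        # renumbering direction never has to be decided case by case.
        keep, drop = sorted((self.dup_set[a], self.dup_set[b]))
        if keep == drop:
            return  # already in the same set
        for doc, number in list(self.dup_set.items()):
            if number == drop:
                self.dup_set[doc] = keep


sets = DupSets()
for name in "ABCD":
    sets.add(name)

sets.mark_duplicates("A", "B")   # A,B known to be dups
sets.mark_duplicates("C", "D")   # separately, C,D known to be dups
sets.mark_duplicates("B", "C")   # later: B and C turn out to be dups
print(sets.dup_set)
```

Under the transitivity assumption, the last call folds C and D into the set that already holds A and B. If duplication is not meant to be transitive, a single dup_set number per document cannot represent the situation and a different design would be needed.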

In response to

Re: A DISTINCT problem removing duplicates at 2008-12-09 15:04:48 from Richard Huxton

Responses

Re: A DISTINCT problem removing duplicates at 2008-12-09 16:12:47 from Richard Huxton