Re: A DISTINCT problem removing duplicates

From:	Richard Huxton <dev(at)archonet(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	PostgreSQL SQL List <pgsql-sql(at)postgresql(dot)org>
Subject:	Re: A DISTINCT problem removing duplicates
Date:	2008-12-09 15:04:48
Message-ID:	493E8910.7010007@archonet.com
Lists:	pgsql-sql

Tom Lane wrote:
> Richard Huxton <dev(at)archonet(dot)com> writes:
>> Anyone got anything more elegant?
>
> Seems to me that no document should have an empty dup_set. If it's not
> a match to any existing document, then immediately assign a new dup_set
> number to it.

That was my initial thought too, but it means that when I actually find a
duplicate I have to decide which "direction" to renumber the sets in. It
probably also means keeping a summary table with counts to show which
documents are duplicates, since the duplicates table is now the same size
as the documents table.

--
Richard Huxton
Archonet Ltd
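
The trade-off discussed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the schema from the thread: the `documents` table, its `title` column, and the helper function names are hypothetical, SQLite (via Python's `sqlite3`) stands in for PostgreSQL to keep the example self-contained, and "the lower dup_set number survives" is just one possible answer to the "direction" question. A `GROUP BY ... HAVING` query stands in for the summary table; whether that is fast enough on a large table is not something this sketch settles.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical documents table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documents ("
        " id INTEGER PRIMARY KEY,"
        " title TEXT NOT NULL,"
        " dup_set INTEGER NOT NULL)"
    )
    return conn


def add_document(conn, title):
    """Insert a document and immediately give it its own dup_set.

    Assumption: the document's own id is reused as its initial set number,
    so no document ever has an empty dup_set.
    """
    cur = conn.execute(
        "INSERT INTO documents (title, dup_set) VALUES (?, 0)", (title,)
    )
    doc_id = cur.lastrowid
    conn.execute("UPDATE documents SET dup_set = id WHERE id = ?", (doc_id,))
    return doc_id


def mark_duplicates(conn, id_a, id_b):
    """Merge the dup_sets of two documents found to be duplicates.

    Assumed direction rule: the lower set number survives, and every
    member of the other set is renumbered to it.
    """
    rows = conn.execute(
        "SELECT dup_set FROM documents WHERE id IN (?, ?)", (id_a, id_b)
    ).fetchall()
    sets = {row[0] for row in rows}
    keep = min(sets)
    for other in sets - {keep}:
        conn.execute(
            "UPDATE documents SET dup_set = ? WHERE dup_set = ?", (keep, other)
        )
    return keep


def duplicate_sets(conn):
    """Return (dup_set, member_count) for sets holding more than one document."""
    return conn.execute(
        "SELECT dup_set, COUNT(*) FROM documents"
        " GROUP BY dup_set HAVING COUNT(*) > 1 ORDER BY dup_set"
    ).fetchall()
```

With four documents inserted in order, marking the third as a duplicate of the first and the fourth as a duplicate of the second leaves two sets of two; marking the fourth and third as duplicates then folds everything into the lower-numbered set.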

In response to

Re: A DISTINCT problem removing duplicates at 2008-12-09 13:50:47 from Tom Lane

Responses

Re: A DISTINCT problem removing duplicates at 2008-12-09 15:39:29 from Tom Lane
