Quick Links

Re: how to group by similarity ?

From:	hari(dot)fuchs(at)gmail(dot)com
To:	pgsql-general(at)postgresql(dot)org
Subject:	Re: how to group by similarity ?
Date:	2012-04-25 08:34:18
Message-ID:	87k414jjlh.fsf@hf.protecting.net
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

Andreas <maps(dot)on(at)gmx(dot)net> writes:

> How would I group the table so that it shows groups that have
> similarity () > x ?
>
> Lets say the table looks like this:
>
> id, txt
> 1, aa1
> 2, bb1
> 3, cc1
> 4, bb2
> 5, bb3
> 6, aa2
> ...
>
> How would a select look like that shows:
>
> id, txt, group_id
> 1, aa1, 1,
> 6, aa2, 1,
> 2, bb1, 2,
> 4, bb2, 2,
> 5, bb3, 2,
> 3, cc1, 3

The following query will do that, but it's convoluted:

WITH grp (t1, id, t2) AS (
SELECT t1.txt, t1.id, t2.txt
FROM tbt t1
LEFT JOIN tbt t2 ON t2.txt > t1.txt
WHERE t2.txt IS NULL OR similarity(t1.txt, t2.txt) > 0
)
SELECT t1, min(id)
FROM (
SELECT t1, id
FROM grp
UNION ALL
SELECT t2, id
FROM grp
WHERE t2 IS NOT NULL
) dummy
GROUP BY t1
ORDER BY t1

In response to

how to group by similarity ? at 2012-04-24 18:07:34 from Andreas

Responses

Re: how to group by similarity ? at 2012-09-09 09:38:35 from Tim Uckun

Browse pgsql-general by date

	From	Date	Subject
Next Message	Willy-Bas Loos	2012-04-25 08:40:40	Re: how robust are custom dumps?
Previous Message	hari.fuchs	2012-04-25 08:31:17	Re: how to group by similarity ?