Skip site navigation (1) Skip section navigation (2)

enumerate groups given a certain value

From: "Picavet Vincent" <Vincent(dot)Picavet(at)mediapost(dot)fr>
To: <pgsql-sql(at)postgresql(dot)org>
Subject: enumerate groups given a certain value
Date: 2008-08-07 17:31:16
Message-ID: EB18254270D1FD429047C987937D4A12031E6665@s92e07497.ad.mediapost.fr (view raw or flat)
Thread:
Lists: pgsql-sql
Hello,

Here is a sql problem, which I thought simple at first, but for which I
ended up with a solution I find surprisingly complicated.
I really think I could have achieved a much easier way of handling this,
but I do not manage to find the trick allowing a very simple and
efficient query to solve the problem.

Let's present it with a sample case. The initial table is the following
one :

--------------------
drop table if exists test_gen ;

create table 
	test_gen as 
select *
from (
	select 
		chr((round(random()* 25) +65)::int)  as id
		, random()* 100  as val
	from
		generate_series(1,200) as g
	order 
		by id
) as foo

select * from test_gen;
-------------------

What I want to do is to enumerate lines for each group of id, following
the order of val.

For example :
id	val			gen
A	2.65105138532817	1
A	38.9289360493422	2
A	74.6089164167643	3
B	2.01512188650668	1
B	11.4642047323287	2
B	31.2643219716847	3
B	65.8427979797125	4
C	0.759994331747293	1
C	11.8905796203762	2
C	13.7388648930937	3
C	49.1934351157397	4
C	83.1861903425306	5
D	45.8268967922777	1
D	57.1161589119583	2
E	9.72125697880983	1
E	61.324825277552	2
E	70.3348958399147	3
F	0.49891234234237	1


Here is the solution I ended up with :

---------------------------

-- first count number of ids per group
drop table test_gen2 ;

create table test_gen2 as 
select t1.*, t2.nb 
from 
	test_gen as t1,
	(
	SELECT 
		id, count(*) as nb 
	FROM 
		test_gen 
	GROUP BY id
	) as t2
WHERE 
	t1.id =t2.id
ORDER BY 
	t1.id;

create sequence seq_test_gen start with 1;
create sequence seq_test_gen2 start with 1;

-- get the table with the order set (gen is our order)
select
	*
from
	(
	select
		foo1.*,
		nextval('seq_test_gen') as serial
	from (
		select
			*
		from
			test_gen2
		order by
			id, val
		) as foo1
	) as t1,
	(
	select
		foo.*,
		nextval('seq_test_gen2') as serial
	from (
		select
			gb1.*, 
			generate_series(1, gb1.nb) as gen
		from (
			select
				id, nb
			from
				test_gen2
			group by
				id, nb
			) as gb1 
		order by 
			gb1.id, gen
		) as foo
	) as t2
where
	t1.serial = t2.serial
	;
-----------------------------------

The problem seems to be as easy as : <sort my two sets and put them side
to side>. But I could not find a better way to do that than putting a
serial on left and right side and do a join on this serial.
I also tried to find a solution using a modulo but could not manage to
get it work.

Anybody for a ray of light on a different approach ? This look like a
recurrent problem, isn't there an experienced sql programmer here who
tackled this issued a couple of time ?

Thanks for any help,
Vincent


Responses

pgsql-sql by date

Next:From: Steve MidgleyDate: 2008-08-07 17:40:22
Subject: Re: DELETE with JOIN
Previous:From: RagnarDate: 2008-08-07 17:05:38
Subject: Re: DELETE with JOIN

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group