Re: unnest and string_to_array on two columns

From: "David Johnston" <polobo(at)yahoo(dot)com>
To: "'Michael Graham'" <mgraham(at)bloxx(dot)com>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: unnest and string_to_array on two columns
Date: 2011-10-25 13:19:55
Message-ID: 009501cc9318$ca568df0$5f03a9d0$@yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

-----Original Message-----
From: pgsql-general-owner(at)postgresql(dot)org [mailto:pgsql-general-owner(at)postgresql(dot)org] On Behalf Of Michael Graham
Sent: Tuesday, October 25, 2011 4:36 AM
To: pgsql-general(at)postgresql(dot)org
Subject: [GENERAL] unnest and string_to_array on two columns

Hi all,

I'm trying to migrate an old (and sucky) schema to a new one and I'm having some difficulties coming up with a sane select.

I have basically id, a, and b where a and b contain a list of flags like

id | a | b |
--------------------
1 | abc | abcdef |

and what to convert this to multiple ids with single flags, like:

id | a | b |
------------------
1 | a | a |
1 | b | b |
1 | c | c |
1 | NULL | d |
1 | NULL | e |
1 | NULL | f |

My first attempt was

SELECT id, unnest(string_to_array(a,NULL)),
unnest(string_to_array(b,NULL)) FROM foo;

But this causes the shorter string to be repeated until it is the same length as the shorter string. In the end I have managed to get the behaviour that I want but the select is horrible:

SELECT COALESCE(aa.id,bb.id) AS id,
aa.unnest AS aaaaa,
bb.unnest AS bbbbb FROM
(
SELECT *, row_number() OVER() FROM
(
SELECT id,unnest(string_to_array(a,NULL)) FROM foo
) AS a
) AS aa
FULL JOIN
(
SELECT *, row_number() OVER() FROM
(
SELECT id,unnest(string_to_array(b,NULL)) FROM foo
) AS b
) AS bb
ON aa.row_number=bb.row_number AND aa.id=bb.id;

So I was wondering if anyone had any better solutions.

Thanks,
--
Michael Graham <mgraham(at)bloxx(dot)com>

----------------- /Original Message -----------------
Same solution but using CTEs.

WITH val_src AS (
SELECT * FROM (VALUES (1, 'a,b,c','a,b,c,d,e')) vals (id, a, b)
), a_expanded AS (
SELECT *, ROW_NUMBER() OVER () AS row_index FROM (
SELECT id, unnest(string_to_array(a, ',')) AS a_item FROM val_src
) a_src
), b_expanded aS (
SELECT *, ROW_NUMBER() OVER () AS row_index FROM (
SELECT id, unnest(string_to_array(b, ',')) AS b_item FROM val_src
) b_src
)
SELECT *
FROM b_expanded NATURAL FULL OUTER JOIN a_expanded;

The only other thought would be to limit your first query to [pseudocode] "WHERE ROW_NUMBER() OVER () <= MIN(b.length, a.length)" and the UNION the remainder of A and B where the row number is "> MIN(b.length, a.length)". This seems worse though.

David J.

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Marti Raudsepp 2011-10-25 15:04:20 Re: strange java query behaviour
Previous Message Merlin Moncure 2011-10-25 13:09:46 Re: Primary key Index Error