Re: Need help with a function from hell..

From: Andrew - Supernews <andrew+nonews(at)supernews(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Need help with a function from hell..
Date: 2006-10-04 02:35:26
Message-ID: slrnei67fe.1rvk.andrew+nonews@atlantis.supernews.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 2006-10-03, arsi(at)aranzo(dot)netg(dot)se <arsi(at)aranzo(dot)netg(dot)se> wrote:
> Hi all,
>
> I have a small coding problem where my function is becoming, well, too
> ugly for comfort. I haven't finished it but you will get picture below.
>
> First a small description of the purpose. I have an aggregate function
> that takes a string and simply concatenates that string to the previous
> (internal state) value of the aggregate, example:
>
> "Hello:World" || ", " || "World:Hello" --> "Hello:World, World:Hello"
>
> My problem is that I sometimes get the same value before the colon
> sign and in those cases I should not add the whole string to the previous
> value of the aggregate but extract the value that is behind the colon and
> add it to already existing part which matched the value before the colon
> but with a slash as a delimiter, example:
>
> Internal state: "Hello:World, World:Hello"
> New value: "Hello:Dolly"
> After function is run: "Hello:World/Dolly, World:Hello"
>
> So what I am doing is a lot of strpos() and substr() functions (I have
> previously asked for the speed of the substr() function) but it is
> beginning to look really alwful.

You might have better luck with a different approach. For example, start
by accumulating the values into an array, rather than a string, and rather
than try and do the magic bits in the transition function, do them in one
pass at the end, making use of the full power of SQL queries rather than
trying to do your own procedural logic.

If you think about your problem in SQL terms, what you're really trying to
do is essentially a "group by" on your first field. If you can avoid the
need to pass that in as a colon-delimited value, then your life will be
much simpler; but even if you can't avoid that, the SQLish solution will
be easier.

As a sample of what you can do, here is a function that does part of the
job (requires the array_accum aggregate as given as an example in the
manual):

create function foo(text[]) returns text[] language sql immutable
as $f$
select ARRAY(select k || ':' || array_to_string(v,'/')
from (select split_part($1[i],':',1) as k,
array_accum(substring($1[i] from ':(.*)')) as v
from generate_series(array_lower($1,1),
array_upper($1,1)) s(i)
group by k) s1)
$f$;

=> select array_to_string(foo(ARRAY['foo:bar', 'baz:quux', 'foo:baz']),',');
array_to_string
----------------------
baz:quux,foo:bar/baz
(1 row)

To understand the function, look at the subqueries from the inside out;
the inner one splits the foo:bar elements into two columns, groups by the
first and collects the corresponding values into an array; the outer one
converts the format back to the one you require.

As a bonus, if you want to eliminate duplicate values, you can just add
the "distinct" keyword inside the array_accum aggregate.

--
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Martijn van Oosterhout 2006-10-04 09:42:00 Re: &nbsp;&nbsp;Hi,&nbsp;ever
Previous Message kaspro 2006-10-04 01:16:10 storing transactions