Re: array/function question

From: Joshua Berry <yoberi(at)gmail(dot)com>
To: Nagy Zoltan <kirk(at)bteam(dot)hu>
Cc: postgresql Forums <pgsql-general(at)postgresql(dot)org>
Subject: Re: array/function question
Date: 2009-05-19 15:02:54
Message-ID: 262FE2BF-AD31-49F5-A4D1-EC56EF824DA9@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


> you should use something similar to 'merge sort'
> but only if your input is sorted (m_bx expects this)

In my case, order is not guaranteed, and as the result needs to match
the order of the input, it seems that using some exhaustive tail
recursive method is the way to go. (By that I mean a loop within a
loop, testing up to m*n times where m and n are the length of the
arrays passed in.

> if your subjects (numbers) are not going beyond a certain limit
> eg(65535)
> take up an array and filter

For my application, there will likely be no more than 20 elements in
the array, so practical limits are not a problem.

> you can generate a poly for array B's roots, and calculate A's points
> -where it's 0, then the B array have the value ;)))
>
> writing the function in C is not so easy but it will be fast ;)

Can anyone point me to documentation on the performance differences
between plpgsql/plc/plperl/etc? I googled but only found a few
offhanded comments from mailing list archives and online message
boards. Are there any general guidelines on when it's a good idea to
switch to a language other than plsql or plpsql?

Here's my modified version of Nagy's function. This one allows
unsorted array elements, ordering the tests by the order of the
elements in the first array parameter.
Please forgive the lack of grace. I'd love tips on how to improve
this! In particular, is there a better way to find the length of an
array without having to piecewise handle the empty array case?

create or replace function m_bx(a integer[],b integer[])
returns boolean[]
AS
$BODY$
declare res boolean[];
declare i integer;
declare j integer;
declare la integer;
declare lb integer;
begin
i=1;
j=1;
-- array_upper returns NULL if the length of the array is 0, the
following hacks provided the desired result for empty array cases
-- la=array_upper(a,1);
la = (select CASE WHEN count is null THEN 0 ELSE count END from
(select array_upper(a::int[], 1) as count) as foo);
-- lb=array_upper(b,1);
lb = (select CASE WHEN count is null THEN 0 ELSE count END from
(select array_upper(b::int[], 1) as count) as foo);
loop
if i>la then
exit;
end if;
if (j>lb) then
res[i]=false;
j=1;
i=i+1;
else
if (a[i] = b[j]) then
--b contains this element, move to the next
res[i]=true;
j=1;
i=i+1;
else
j=j+1;
end if;
end if;
end loop;
return res;
end;
$BODY$
LANGUAGE 'plpgsql' IMMUTABLE
COST 100;

--Test cases to handle:
select m_bx('{1,2,5,4}','{5, 1, 4}'); --{t,f,t,t}
select m_bx('{1,2,5,4}','{5}'); --{f,f,t,f}
select m_bx('{1,2,5,4}','{}'); --{f,f,f,f}
select m_bx('{}'::int[],'{}'); --{}::bool

Regards,

Joshua Berry

On May 18, 2009, at 10:00 PM, Nagy Zoltan wrote:

>
>
> create or replace function m_bx(a integer[],b integer[])
> returns boolean[]
> as
> $BODY$
> declare res boolean[];
> declare i integer;
> declare j integer;
> declare la integer;
> declare lb integer;
> begin
> i=1;
> j=1;
> la=array_upper(a,1);
> lb=array_upper(b,1);
> loop
> if i>la then
> exit;
> end if;
> if (j<=lb and a[i] = b[j]) then
> res[i]=true;
> else
> res[i]=false;
> end if;
> if(b[j]<a[i]) then
> j=j+1;
> else
> i=i+1;
> end if;
> end loop;
>
> return res;
> end;
> $BODY$
> LANGUAGE 'plpgsql' IMMUTABLE
> COST 100;
>
> select m_bx('{1,2,4,5}','{1,5,6}');
>
>
> Joshua Berry wrote:
>> Hello All,
>>
>> I'm trying to optimize a few slow queries and helper functions, and
>> have
>> found a poor performing function. To improve performance, I'd like to
>> create a function that does the following:
>>
>>
>> Inputs:
>> A: an array of integers. for example: { 1, 2, 3, 4, 7 }
>> B: an array of integers. for example: { 1, 4, 8, 9 }
>>
>> Returns
>> C: an array of bools the same dimensions as Array A. In this
>> example: {
>> true, false, false, false, true, false }
>>
>> Effectively, this function would use Array A as a set of boolean
>> tests
>> to exercise on Array B. The result array will have the save number of
>> elements as array A.
>>
>> What I lack is the knowledge of how to
>> 1. index and compare arrays when their input size is not known. (I
>> only
>> know how to use hardcoded indexes like A[1], B[2], etc.
>> 2. To use control structures for recursion/looping. I've read
>> http://www.postgresql.org/docs/8.3/interactive/plpgsql-control-structures.html
>> but
>> still not sure how to apply the grammar to arrays data types.
>>
>> If there is a builtin array function that achieves this, that would
>> be
>> good to know as well.
>>
>> Cheers,
>>
>> -Joshua
>>
>> Joshua Berry
>>
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Sebastien FLAESCH 2009-05-19 15:25:32 Re: INTERVAL data type and libpq - what format?
Previous Message Bayless Kirtley 2009-05-19 14:40:08 Re: Daylight saving time question