Re: REVIEW: WIP: plpgsql - foreach in

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: REVIEW: WIP: plpgsql - foreach in
Date: 2011-01-26 07:09:47
Message-ID: AANLkTin4_y2e3j4U8+BOXZVmEGU-M6uNxzuefzkjq+Hi@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

2011/1/26 Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>:
> On Mon, Jan 24, 2011 at 20:10, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>> we have to iterate over array's items because it allow seq. access to
>> array's data. I need a global index for function "array_get_isnull". I
>> can't to use a buildin functions like array_slize_size or
>> array_get_slice, because there is high overhead of array_seek
>> function. I redesigned the initial part of main cycle, but code is
>> little bit longer :(, but I hope, it is more readable.
>
> The attached patch only includes changes in plpgsql. I tested it
> combined with the previous one, that includes other directories.
>
> * src/pl/plpgsql/src/gram.y
> | for_variable K_SLICE ICONST
> The parameter for SLICE must be an integer literal. Is it a design?
> I got a syntax error when I wrote a function like below. However,
> FOREACH returns different types on SLICE = 0 or SLICE >= 1.
> (element type for 0, or array type for >=1)  So, we cannot write
> a true generic function that accepts all SLICE values even if
> the syntax supports an expr for the parameter.

Yes, it is design. You wrote a reason. A parametrized SLICE needs only
a few lines more, but a) I really don't know a use case, b) there is
significant difference between slice 0 and slice 1

>
> CREATE FUNCTION foreach_test(int[], int) RETURNS SETOF int[] AS
> $$
> DECLARE
>  a integer[];
> BEGIN
>  FOREACH a SLICE $2 IN $1 LOOP -- syntax error
>    RETURN NEXT a;
>  END LOOP;
> END;
> $$ LANGUAGE plpgsql;
>
> * /doc/src/sgml/plpgsql.sgml
> + FOREACH <replaceable>target</replaceable> <optional> SCALE
> s/SCALE/SLICE/ ?
>
> * Why do you use "foreach_a" for some identifiers?
> We use "foreach" only for arrays, so "_a" suffix doesn't seem needed.

yes, it isn't needed. But FOREACH is a new construct, that can be used
for more different iterations - maybe iterations over hstore,
element's set, ...

so _a means - this is implementation for arrays.

>
> * accumArrayResult() for SLICE has a bit overhead.
> If we want to optimize it, we could use memcpy() because slices are
> placed in continuous memory. But I'm not sure the worth; I guess
> FOREACH will be used with SLICE = 0 in many cases.
>

I agree with you, so SLICE > 0 will not be used often.

accumArrayResult isn't expensive function - for non varlena types. The
cutting of some array needs a redundant code to CopyArrayEls and
construct_md_array. All core functions are not good for our purpose,
because doesn't expect a seq array reading :(. Next - simply copy can
be done only for arrays without null bitmap, else you have to do some
bitmap rotations :(. So, I don't would do this optimization in this
moment. It has sense for multidimensional numeric arrays, but can be
solved in next version. It's last commitfest now, and I don't do this
patch more complex then it is now.

Regards

Pavel Stehule

> --
> Itagaki Takahiro
>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2011-01-26 07:20:39 Re: SSI patch version 14
Previous Message Itagaki Takahiro 2011-01-26 04:53:10 Re: REVIEW: WIP: plpgsql - foreach in