Re: Postgres array parser

From: "Marc Mamin" <M(dot)Mamin(at)intershop(dot)de>
To: "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>, "Aleksej Trofimov" <aleksej(dot)trofimov(at)ruptela(dot)lt>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Postgres array parser
Date: 2011-12-14 09:21:56
Message-ID: C4DAC901169B624F933534A26ED7DF310861B23A@JENMAIL01.ad.intershop.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hello,

For such cases (see below), it would be nice to have an unnest function that only affect the first array dimension.

Something like

unnest(ARRAY[[1,2],[2,3]], SLICE=1)
=>
unnest
------
[1,2]
[2,3]

With this function, I imagine that following sql function
might beat the plpgsql FOREACH version.

CREATE OR REPLACE FUNCTION input_value_un (in_inputs numeric[], in_input_nr numeric)
RETURNS numeric AS
$BODY$

SELECT u[1][2]
FROM unnest($1, SLICE =1) u
WHERE u[1][1]=in_input_nr
LIMIT 1;

$BODY$
LANGUAGE sql IMMUTABLE;



best regards,

Marc Mamin

> -----Original Message-----
> From: pgsql-performance-owner(at)postgresql(dot)org [mailto:pgsql-performance-
> owner(at)postgresql(dot)org] On Behalf Of Pavel Stehule
> Sent: Dienstag, 13. Dezember 2011 15:43
> To: Aleksej Trofimov
> Cc: pgsql-performance(at)postgresql(dot)org
> Subject: Re: [PERFORM] Postgres array parser
>
> Hello
>
> 2011/12/13 Aleksej Trofimov <aleksej(dot)trofimov(at)ruptela(dot)lt>:
> > We have tried foreach syntax, but we have noticed performance
> degradation:
> > Function with for: 203ms
> > Function with foreach: ~250ms:
> >
> > there is functions code:
> > CREATE OR REPLACE FUNCTION input_value_fe(in_inputs numeric[],
> in_input_nr
> > numeric)
> >  RETURNS numeric AS
> > $BODY$
> > declare i numeric[];
> > BEGIN
> >        FOREACH i SLICE 1 IN ARRAY in_inputs
> >            LOOP
> >                 if i[1] = in_input_nr then
> >                    return i[2];
> >                 end if;
> >            END LOOP;
> >
> >    return null;
> > END;
> > $BODY$
> >  LANGUAGE plpgsql VOLATILE
> >  COST 100;
> >
> > CREATE OR REPLACE FUNCTION input_value(in_inputs numeric[],
> in_input_nr
> > numeric)
> >  RETURNS numeric AS
> > $BODY$
> > declare
> >  size int;
> > BEGIN
> >  size = array_upper(in_inputs, 1);
> >    IF size IS NOT NULL THEN
> >
> >        FOR i IN 1 .. size LOOP
> >            if in_inputs[i][1] = in_input_nr then
> >                return in_inputs[i][2];
> >            end if;
> >        END LOOP;
> >    END IF;
> >
> >    return null;
> > END;
> > $BODY$
> >  LANGUAGE plpgsql VOLATILE
> >  COST 100;
> >
> >
> > On 12/13/2011 04:02 PM, Pavel Stehule wrote:
> >>
> >> Hello
> >>
> >> do you know FOREACH IN ARRAY statement in 9.1
> >>
> >> this significantly accelerate iteration over array
> >>
> >>
> >> http://www.depesz.com/index.php/2011/03/07/waiting-for-9-1-foreach-
> in-array/
> >>
> >>
> >>
> >> 2011/12/13 Aleksej Trofimov<aleksej(dot)trofimov(at)ruptela(dot)lt>:
> >>>
> >>> Hello, I wanted to ask according such a problem which we had faced
> with.
> >>> We are widely using postgres arrays like key->value array by doing
> like
> >>> this:
> >>>
> >>> {{1,5},{2,6},{3,7}}
> >>>
> >>> where 1,2,3 are keys, and 5,6,7 are values. In our pgSql functions
> we are
> >>> using self written array_input(array::numeric[], key::numeric)
> function
> >>> which makes a loop on whole array and searches for key like
> >>> FOR i IN 1 .. size LOOP
> >>>            if array[i][1] = key then
> >>>                return array[i][2];
> >>>            end if;
> >>> END LOOP;
> >>>
> >>> But this was a good solution until our arrays and database had
> grown. So
> >>> now
> >>> FOR loop takes a lot of time to find value of an array.
> >>>
> >>> And my question is, how this problem of performance could be
> solved? We
> >>> had
> >>> tried pgperl for string parsing, but it takes much more time than
> our
> >>> current solution. Also we are thinking about self-written C++
> function,
> >>> may
> >>> be someone had implemented this algorithm before?
> >>>
> >> you can use indexes or you can use hstore
> >>
> >> Regards
> >>
> >> Pavel Stehule
> >>
> >>> --
> >>> Best regards
> >>>
> >>> Aleksej Trofimov
> >>>
> >>>
> >>> --
> >>> Sent via pgsql-performance mailing list
> >>> (pgsql-performance(at)postgresql(dot)org)
> >>> To make changes to your subscription:
> >>> http://www.postgresql.org/mailpref/pgsql-performance
> >
> >
>
> It is strange - on my comp FOREACH is about 2x faster
>
> postgres=# select input_value(array(select
> generate_series(1,1000000)::numeric), 100000);
> input_value
> -------------
>
> (1 row)
>
> Time: 495.426 ms
>
> postgres=# select input_value_fe(array(select
> generate_series(1,1000000)::numeric), 100000);
> input_value_fe
> ----------------
>
> (1 row)
>
> Time: 248.980 ms
>
> Regards
>
> Pavel
>
>
> >
> > --
> > Best regards
> >
> > Aleksej Trofimov
> >
> > UAB "Ruptela"
> >
> > Phone: +370 657 80475
> >
> > E-Mail: aleksej(dot)trofimov(at)ruptela(dot)lt
> > Web:    http://www.ruptela.lt
> >
> > Ruptela - the most successful IT company in Lithuania 2011
> > Ruptela - sekmingiausia Lietuvos aukštųjų technologijų įmonė 2011
> >
> http://www.prezidentas.lt/lt/spaudos_centras_392/pranesimai_spaudai/ino
> vatyvus_verslas_-_konkurencingos_lietuvos_pagrindas.html
> > http://www.ruptela.lt/news/37/121/Ruptela-sekmingiausia-jauna-
> aukstuju-technologiju-imone-Lietuvoje
> >
>
> --
> Sent via pgsql-performance mailing list (pgsql-
> performance(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Aleksej Trofimov 2011-12-14 09:59:34 Re: Postgres array parser
Previous Message idc danny 2011-12-14 07:02:10 Re: copy vs. C function