Re: Concatenate performance question

From: "Michael Guyver" <kenevel(at)googlemail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Concatenate performance question
Date: 2006-12-03 13:52:06
Message-ID: 30b57570612030552w34682fbbta948394b9e3673f5@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Gents,

At risk of answering my own question to spur someone actually to share
their thoughts on this topic, I thought I'd provide a quick look at
the performance of the alternatives: either using the || operator, or
the array_append method.

-- SELECT * FROM test_v_01();
-- SELECT * FROM test_v_02();

CREATE OR REPLACE FUNCTION test_v_01() RETURNS VARCHAR AS $$
DECLARE
buffer varchar;
i int4;
BEGIN
buffer := 'the quick brown fox jumps over the lazy dog';
FOR i IN 1..1000 LOOP
buffer := buffer || 'the quick brown fox jumps over the lazy dog';
END LOOP;
RETURN buffer;
END;
$$
LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION test_v_02() RETURNS VARCHAR AS $$
DECLARE
buffer varchar[] := '{}';
BEGIN
buffer := array_append(buffer,'the quick brown fox jumps over the lazy dog');
FOR i IN 1..1000 LOOP
buffer := array_append(buffer, 'the quick brown fox jumps over the lazy dog');
END LOOP;
RETURN array_to_string(buffer,'');
END;
$$
LANGUAGE plpgsql;

Running the array_append version is faster by at least one order of
magnitude in these examples. However, where you can in-line the ||
operator with multiple operands, ie

buffer := buffer || 'token 1' || results.user_id::text || 'token 2' ||
results.event_id::text || 'token3';

it is faster than calling

buffer := array_append(buffer, 'token 1');
buffer := array_append(buffer, results.user_id::text);
buffer := array_append(buffer, 'token 2');
buffer := array_append(buffer, results.event_id::text);
buffer := array_append(buffer, 'token 3');

This seems entirely reasonable, as the latter requires the evaluation
of five calls, whereas the former can do it in one go.

However, my original question still stands - is there another way of
doing this? Is it possible to write to a bytea or blob or stream and
avoid having to do any concatenation at all?

Cheers

Michael

On 29/11/06, Michael Guyver <kenevel(at)googlemail(dot)com> wrote:
> Hi there,
>
> I've got a rather large PL/pgSQL function which returns a varchar
> (though it could be text, char or blob, I'm not fussy) containing JSON
> information (where JSON is Javascript Object Notation). The middle
> tier of the app does pretty much sweet FA except pass this straight
> back to the client. I'm interested in seeing how much faster I can get
> the app to process a request this way as opposed to retrieving the
> data over three or four calls to the DB before constructing the JSON
> response in the middle tier.
>
> I've got to the point where I suspect the concatenation could do with
> some attention. What is the fastest way of returning this to the
> client?
>
> I thought that storing the individual segments of text in an array and
> stitiching it all together at the end of the function may be a fast
> way of doing things, using an
>
> array_to_string(resultArray,'');
>
> call. However I have my doubts whether the
>
> resultArray := array_append(resultArray,'next token');
>
> is performant as it looks as though it's constructing a new array from
> the argument each time its called. Can someone confirm or rebut this?
>
> How would a simple
>
> result := result || 'next token';
>
> perform? The result size is in the 20-25 Kb range.
>
> A mate mentioned that the way Oracle's OWS does it is to stream the
> response back as a blob. I presume he means that the function could
> iterate over the different queries' result-sets and simply write the
> results to the blob before returning. Can anyone shed any light on
> this approach and its applicabilty to PostgreSQL?
>
> Cheers
>
> Michael
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message tam wei 2006-12-03 16:28:52 Storing files in postgres db
Previous Message vivek 2006-12-03 10:53:30 Re: select query not using index