Re: Manipulating complex types as non-contiguous structures in-memory

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Manipulating complex types as non-contiguous structures in-memory
Date: 2015-05-06 08:20:12
Message-ID: CAFj8pRD27hjKUPZsCwmFS0wXxPwKVEm5B24F2j9ihF4RDr=eyg@mail.gmail.com
Lists: pgsql-hackers

2015-05-06 0:50 GMT+02:00 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:

> I wrote:
> > Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
> >> Significant slowdown is on following test:
>
> >> do $$ declare a int[] := '{}'; begin for i in 1..90000 loop a := a || 10;
> >> end loop; end$$ language plpgsql;
> >> do $$ declare a numeric[] := '{}'; begin for i in 1..90000 loop a := a || 10.1;
> >> end loop; end$$ language plpgsql;
>
> >> integer master 14sec x patched 55sec
> >> numeric master 43sec x patched 108sec
>
> >> It is probably worst case - and it is known plpgsql antipattern
>
> > Yeah, I have not expended a great deal of effort on the array_append/
> > array_prepend/array_cat code paths. Still, in these plpgsql cases,
> > we should in principle have gotten down from two array copies per loop to
> > one, so it's disappointing to not have better results there, even granting
> > that the new "copy" step is not just a byte-by-byte copy. Let me see if
> > there's anything simple to be done about that.
>
> The attached updated patch reduces both of those do-loop tests to about
> 60 msec on my machine. It contains two improvements over the 1.1 patch:
>
> 1. There's a fast path for copying an expanded array to another expanded
> array when the element type is pass-by-value: we can just memcpy the
> Datum array instead of working element-by-element. In isolation, that
> change made the patch a little faster than 9.4 on your int-array case,
> though of course it doesn't help for the numeric-array case (and I do not
> see a way to avoid working element-by-element for pass-by-ref cases).
>
> 2. pl/pgsql now detects cases like "a := a || x" and allows the array "a"
> to be passed as a read-write pointer to array_append, so that array_append
> can modify expanded arrays in-place and avoid inessential data copying
> altogether. (The earlier patch had made array_append and array_prepend
> safe for this usage, but there wasn't actually any way to invoke them
> with read-write pointers.) I had speculated about doing this in my
> earliest discussion of this patch, but there was no code for it before.
>
> The key question for change #2 is how do we identify what is a "safe"
> top-level function that can be trusted not to corrupt the read-write value
> if it fails partway through. I did not have a good answer before, and
> I still don't; what this version of the patch does is to hard-wire
> array_append and array_prepend as the functions considered safe.
> Obviously that is crying out for improvement, but we can leave that
> question for later; at least now we have infrastructure that makes it
> possible to do it.
>
> Change #1 is actually not relevant to these example cases, because we
> don't copy any arrays within the loop given change #2. But I left it in
> because it's not much code and it will help for situations where change #2
> doesn't apply.
>

I can confirm this speedup - pretty nice.

Multidimensional append is about 2x slower, but that is probably a corner case:

do $$ declare a int[] := '{}'; begin for i in 1..90000 loop a := a || ARRAY[[i]];
end loop; raise notice '%', 'aa'; end$$ language plpgsql;

However, this optimization does not kick in for code that is semantically
the same as a || i:

do $$ declare a int[] := '{}'; begin for i in 1..90000 loop a := a || ARRAY[i];
end loop; raise notice '%', 'aa'; end$$ language plpgsql;
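
For comparison, here is a minimal sketch of the scalar form of the same loop.
Per the description quoted above, the patch hard-wires array_append (and
array_prepend) as safe for in-place updates. a || i resolves to array_append,
while a || ARRAY[i] resolves to array_cat; I am assuming that difference is
why only the scalar form gets the fast path. The final notice is only there
to check the resulting length.

```sql
-- Scalar append: "a || i" resolves to array_append(anyarray, anyelement),
-- one of the two functions the patch treats as safe for in-place modification.
-- Assumption: run against a server with the patch applied to see the speedup;
-- the block itself is valid on unpatched servers too, just slower.
do $$
declare
  a int[] := '{}';
begin
  for i in 1..90000 loop
    a := a || i;   -- scalar right-hand operand, not ARRAY[i]
  end loop;
  raise notice 'length: %', array_length(a, 1);
end
$$ language plpgsql;
```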

So the detection seems to be too sensitive to how the expression is written.

There is a slowdown with multidimensional arrays, but that is not a typical
use case, and the speedup elsewhere is 5-10x or more, so I would be very
happy to see this patch in 9.5.

Regards

Pavel

>
> regards, tom lane
>
>
