Re: Manipulating complex types as non-contiguous structures in-memory

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Martijn van Oosterhout <kleptog(at)svana(dot)org>
Subject: Re: Manipulating complex types as non-contiguous structures in-memory
Date: 2015-02-15 23:41:47
Message-ID: 1960.1424043707@sss.pgh.pa.us
Lists: pgsql-hackers

Here's an updated version of the patch I sent before. Notable changes:

* I switched over to calling "deserialized" objects "expanded" objects,
and the default representation is now called "flat" or "flattened" instead
of "reserialized". Per suggestion from Robert.

* I got rid of the bit about detecting read-write pointers by address
comparison. Instead there are now two vartag values for R/W and R/O
pointers. After further reflection I concluded that my previous worry
about wanting copied pointers to automatically become read-only was
probably wrong, so there's no need for extra confusion here.

* I added support for extracting array elements from expanded values
(array_ref).

* I hacked plpgsql to force values of array-type variables into expanded
form; this is needed to get any win from the array_ref change if the
function doesn't do any assignments to elements of the array. This is an
improvement over the original patch, which hardwired array_set to force
expansion, but I remain unsatisfied with it as a long-term answer. It's
not clear that it's always a win to do this (but the tradeoff will change
as we convert more array support functions to handle expanded inputs, so
it's probably not worth getting too excited about that aspect of it yet).
A bigger complaint is that this approach cannot fix things for non-builtin
types such as hstore. I'm hesitant to add a pg_type column carrying an
expansion function OID, but there may be no other workable answer for
extension types.
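
As an illustration of the R/W versus R/O distinction in the second item above, here is a minimal standalone sketch of a pointer whose tag, rather than its address, says whether the holder may modify the object. It is not code from the patch: every name here (ToyTag, ToyExpandedArray, ToyExpandedPtr, toy_make_rw, toy_make_ro, toy_set_elem) and both tag values are hypothetical, chosen only for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tag values; NOT the patch's actual vartag numbers. */
typedef enum ToyTag
{
    TOY_TAG_EXPANDED_RO = 1,    /* holder may only read the object */
    TOY_TAG_EXPANDED_RW = 2     /* holder may modify the object in place */
} ToyTag;

/* A toy "expanded" object: an int array kept in directly usable form. */
typedef struct ToyExpandedArray
{
    size_t nelems;
    int   *elems;
} ToyExpandedArray;

/* A tagged pointer: the tag, not the address, says what is allowed. */
typedef struct ToyExpandedPtr
{
    ToyTag            tag;
    ToyExpandedArray *obj;
} ToyExpandedPtr;

/* Wrap an object in a read-write pointer. */
ToyExpandedPtr toy_make_rw(ToyExpandedArray *obj)
{
    ToyExpandedPtr p = {TOY_TAG_EXPANDED_RW, obj};
    return p;
}

/* Derive a read-only pointer to the same object; an explicit step. */
ToyExpandedPtr toy_make_ro(ToyExpandedPtr p)
{
    p.tag = TOY_TAG_EXPANDED_RO;
    return p;
}

bool toy_is_rw(ToyExpandedPtr p)
{
    return p.tag == TOY_TAG_EXPANDED_RW;
}

/* Update in place only through a read-write pointer with a valid index. */
bool toy_set_elem(ToyExpandedPtr p, size_t idx, int value)
{
    if (!toy_is_rw(p) || idx >= p.obj->nelems)
        return false;
    p.obj->elems[idx] = value;
    return true;
}
```

In this toy model a plain copy of a ToyExpandedPtr keeps whatever tag it had; downgrading to read-only happens only through the explicit toy_make_ro call, so nothing depends on where the pointer itself is stored.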

The patch as it stands is able to do nice things with:

create or replace function arraysetnum(n int) returns numeric[] as $$
declare res numeric[] := '{}';
begin
  for i in 1 .. n loop
    res[i] := i;
  end loop;
  return res;
end
$$ language plpgsql strict;

create or replace function arraysumnum(arr numeric[]) returns numeric as $$
declare res numeric := 0;
begin
  for i in array_lower(arr, 1) .. array_upper(arr, 1) loop
    res := res + arr[i];
  end loop;
  return res;
end
$$ language plpgsql strict;

regression=# select arraysumnum(arraysetnum(100000));
 arraysumnum
-------------
  5000050000
(1 row)

Time: 304.336 ms

(versus approximately 1 minute in 9.4, although these numbers are for
cassert builds so should be taken with a grain of salt.) There are
still a couple more flattening/expansion conversions than I'd like;
in particular, the array returned by arraysetnum() gets flattened on its
way out, which would be good to avoid.
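
A toy model may help show why the two representations can differ so much on a loop like arraysetnum(). The sketch below assumes that, in the flat form, each element assignment builds a fresh copy of the whole array, while the expanded form keeps spare capacity and is updated in place. It is standalone C, not PostgreSQL code; ToyFlat, ToyExpanded, and the toy_* functions are hypothetical names, and the model counts copied elements rather than measuring time.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy "flat" array: one exact-size block, replaced wholesale on change. */
typedef struct ToyFlat { size_t n; int *vals; } ToyFlat;

/* Toy "expanded" array: spare capacity so appends can happen in place. */
typedef struct ToyExpanded { size_t n; size_t cap; int *vals; } ToyExpanded;

/* Append by building a fresh copy; returns number of elements copied. */
size_t toy_flat_append(ToyFlat *a, int v)
{
    size_t copied = a->n;
    int   *fresh = malloc((a->n + 1) * sizeof(int));

    assert(fresh != NULL);
    if (a->n > 0)
        memcpy(fresh, a->vals, a->n * sizeof(int));
    fresh[a->n] = v;
    free(a->vals);
    a->vals = fresh;
    a->n++;
    return copied;
}

/* Append in place; elements are (at worst) copied only on a doubling. */
size_t toy_expanded_append(ToyExpanded *a, int v)
{
    size_t copied = 0;

    if (a->n == a->cap)
    {
        size_t newcap = (a->cap == 0) ? 8 : a->cap * 2;
        int   *fresh = realloc(a->vals, newcap * sizeof(int));

        assert(fresh != NULL);
        copied = a->n;          /* worst case: realloc moved the block */
        a->vals = fresh;
        a->cap = newcap;
    }
    a->vals[a->n++] = v;
    return copied;
}

/* Total elements copied while appending 1..n to a flat array. */
size_t toy_flat_total(size_t n)
{
    ToyFlat a = {0, NULL};
    size_t  total = 0;

    for (size_t i = 1; i <= n; i++)
        total += toy_flat_append(&a, (int) i);
    free(a.vals);
    return total;
}

/* Total elements copied while appending 1..n to an expanded array. */
size_t toy_expanded_total(size_t n)
{
    ToyExpanded a = {0, 0, NULL};
    size_t      total = 0;

    for (size_t i = 1; i <= n; i++)
        total += toy_expanded_append(&a, (int) i);
    free(a.vals);
    return total;
}
```

Under these assumptions the flat model copies n(n-1)/2 elements over n appends, while the expanded model's copying stays proportional to n. Each extra flattening or expansion conversion would add one more full pass over the array on top of that.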

I'm going to stick this into the commitfest even though it's not really
close to being committable; I see some other people doing likewise with
their pet patches ;-). Where it could particularly use some reviewing
help is in exploring the performance changes it creates: what cases does
it make substantially worse?

regards, tom lane

Attachment Content-Type Size
expanded-arrays-0.2.patch text/x-diff 83.0 KB
