Re: Parallelized polymorphic aggs, and aggtype vs aggoutputtype

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallelized polymorphic aggs, and aggtype vs aggoutputtype
Date: 2016-06-20 10:19:03
Message-ID: CAKJS1f-Eg1Kk69gdnkUGdSEE_25DcyRXUdTNNdyHqxPWp00rSw@mail.gmail.com
Lists: pgsql-hackers

On 20 June 2016 at 19:06, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> wrote:
> On 18 June 2016 at 05:45, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> A possible solution is to give deserialize an extra dummy argument, along
>> the lines of "deserialize(bytea, internal) returns internal", thereby
>> ensuring it can't be called in any non-system-originated contexts. This
>> is still rather dangerous if the other argument is variable, as somebody
>> might be able to abuse an internal-taking function by naming it as the
>> deserialize function for a maliciously-designed aggregate. What I'm
>> inclined to do to lock it down further is to drop the "serialtype"
>> argument to CREATE AGGREGATE, which seems rather pointless (what else
>> would you ever use besides bytea?). Instead, insist that
>> serialize/deserialize apply *only* when the transtype is INTERNAL, and
>> their signatures are exactly "serialize(internal) returns bytea" and
>> "deserialize(bytea, internal) returns internal", never anything else.
>
> This is also the only way that I can think of to fix this issue. If we
> can agree that the fix should be to insist that the deserialisation
> function take an additional 2nd parameter of INTERNAL, then I can
> write a patch to fix this, and include a patch for the document
> section 35.10 to explain better about parallelising user defined
> aggregates.

I've gone and implemented the dummy argument approach for
deserialization functions.

If we go with this, I can then write the docs for 35.10 which'll serve
to explain parallel user defined aggregates in detail.

Some notes about the patch:

I didn't remove the comments at the top of each deserial function
which mention something like:

* numeric_avg_serialize(numeric_avg_deserialize(bytea)) must result in a value
* which matches the original bytea value.

I'm thinking that perhaps these now make a little less sense, given
that numeric_avg_deserialize is now numeric_avg_deserialize(bytea,
internal).

Perhaps these should be updated or removed.
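
To illustrate the round-trip property those comments describe, here is a
minimal standalone sketch. It is not code from the patch and does not use
any PostgreSQL headers: AvgState, avg_serialize, avg_deserialize and
roundtrip_matches are hypothetical names, and the 16-byte big-endian layout
is an assumption made only for this example. The second argument of
avg_deserialize stands in for the dummy INTERNAL parameter and is ignored.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an INTERNAL transition state. */
typedef struct AvgState
{
    int64_t count;  /* number of values aggregated so far */
    int64_t sum;    /* running sum of those values */
} AvgState;

/* Assumed wire format: two 8-byte big-endian integers. */
#define AVG_STATE_WIRE_SIZE 16

/* Store v into dst as 8 bytes, most significant byte first. */
void put_u64_be(unsigned char *dst, uint64_t v)
{
    for (int i = 7; i >= 0; i--)
    {
        dst[i] = (unsigned char) (v & 0xFF);
        v >>= 8;
    }
}

/* Read 8 big-endian bytes from src. */
uint64_t get_u64_be(const unsigned char *src)
{
    uint64_t v = 0;

    for (int i = 0; i < 8; i++)
        v = (v << 8) | src[i];
    return v;
}

/* Shape of "serialize(internal) returns bytea". */
void avg_serialize(const AvgState *state, unsigned char *out)
{
    put_u64_be(out, (uint64_t) state->count);
    put_u64_be(out + 8, (uint64_t) state->sum);
}

/*
 * Shape of "deserialize(bytea, internal) returns internal".
 * The dummy argument carries no data; it only changes the signature.
 */
AvgState avg_deserialize(const unsigned char *in, void *dummy)
{
    AvgState s;

    (void) dummy;  /* intentionally unused */
    assert(in != NULL);
    s.count = (int64_t) get_u64_be(in);
    s.sum = (int64_t) get_u64_be(in + 8);
    return s;
}

/* Returns 1 if serialize(deserialize(wire, dummy)) reproduces wire. */
int roundtrip_matches(const unsigned char *wire)
{
    unsigned char again[AVG_STATE_WIRE_SIZE];
    AvgState s = avg_deserialize(wire, NULL);

    avg_serialize(&s, again);
    return memcmp(wire, again, AVG_STATE_WIRE_SIZE) == 0;
}
```

In this sketch the property still holds with the extra argument, so one
option would be to reword the comments along the lines of
"serialize(deserialize(bytea, internal)) must match the original bytea
value" rather than drop them.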

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
deserialization_function_fix.patch application/octet-stream 9.4 KB
