Re: Parallelized polymorphic aggs, and aggtype vs aggoutputtype

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallelized polymorphic aggs, and aggtype vs aggoutputtype
Date: 2016-06-17 19:14:34
Message-ID: 30448.1466190874@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Fri, Jun 17, 2016 at 1:45 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> there's a far bigger problem which is that "deserialize(bytea) returns
>> internal" is a blatant security hole, which I see you ignored the defense
>> against in opr_sanity.sql. The claim in the comment there that it's okay
>> if we check for aggregate context is a joke --- or haven't you heard that
>> we allow users to create aggregates? That's not nearly good enough to
>> prevent unsafe usage of such functions.

> That may be true, but it isn't obvious to me. The user can't simply
> say SELECT numeric_avg_deserialize(...) or whatever because the
> function won't execute when called that way. How would you arrange to
> call such a function in a context where it wouldn't fail outright but
> would return a value that you could then hand to some other function
> that expects a type-internal input?

The concern I have is that you could stick it into an aggregate that isn't
the one it's expecting to be used in, or into a slot in that aggregate
that isn't deserialize(), and the run-time test can't detect either of
those things. Now maybe this is locked down sufficiently by the fact
that we don't let non-superusers create aggregates with transtype
INTERNAL, but that seems pretty shaky to me given the number of moving
parts in aggregates these days and the fact that we keep adding more.
But whether there's a live security bug there today is really moot,
because of:

>> Not to mention that CREATE
>> FUNCTION won't allow creation of such functions, so extensions are locked
>> out of using this feature.

> Oops.

I think that means we *have* to change this.

>> A possible solution is to give deserialize an extra dummy argument, along
>> the lines of "deserialize(bytea, internal) returns internal", thereby
>> ensuring it can't be called in any non-system-originated contexts.

> Seems promising.

If nobody has a better idea, I'll pursue that next week or so. It doesn't
seem like a must-fix-for-beta2. (Letting it slide means we're locked into
an initdb or pg_upgrade again for beta3, but I don't have a lot of hope of
avoiding that anyway.)

Meanwhile, I'll go see if I can work out exactly what's broken about the
polymorphic type-slinging. That I would like to see working in beta2.

> I originally had the thought that you might want to make the
> serialized representation something that could be human-readable.

Perhaps, but on efficiency grounds I doubt that people would want that
to be the same API that's used for partial aggregation. I could see
adding an additional function slot "prettyprint(internal) returns text"
to aggregates for this purpose. In any case, even if we lock down the
declared type to be bytea, there's nothing stopping someone from defining
their serialization code to output an ASCII string.

> I think we should break up internal into various kinds of internal
> depending on what kind of a thing we've actually got a pointer to.

Not a bad long-term project, but it's not happening in this cycle.
I'm not very sure how we'd go about it anyway --- for examples
like this, every new user-defined aggregate potentially wants its
own flavor of "internal", so how do we manage that?

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2016-06-17 19:26:45 Re: Parallelized polymorphic aggs, and aggtype vs aggoutputtype
Previous Message Robert Haas 2016-06-17 19:11:32 Re: Parallel safety tagging of extension functions