Re: Domains and arrays and composites, oh my

From: Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Domains and arrays and composites, oh my
Date: 2017-09-28 15:33:36
Message-ID: aaaa73ea-4d5f-ed8c-7222-06a2d5f2891e@2ndQuadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 07/13/2017 03:19 PM, Tom Lane wrote:
> I wrote:
>> I started to look into allowing domains over composite types, which is
>> another never-implemented case that there's no very good reason not to
>> allow. Well, other than the argument that the SQL standard only allows
>> domains over "predefined" (built-in) types ... but we blew past that
>> restriction ages ago.
> Attached is a draft patch that allows domains over composite types.
> I think it's probably complete on its own terms, but there are some
> questions around behavior of functions returning domain-over-composite
> that could use discussion, and some of the PLs need some follow-on work.
>
> The core principle here was to not modify execution-time behavior by
> adding domain checks to existing code paths. Rather, I wanted the
> parser to insert CoerceToDomain nodes wherever checks would be needed.
> Row-returning node types such as RowExpr and FieldStore only return
> regular composite types, as before; to generate a value of a composite
> domain, we construct a value of the base type and then CoerceToDomain.
> (This is pretty analogous to what happens for domains over arrays.)
> Whole-row Vars can only have regular composite types as vartype,
> TupleDescs can only have regular composite types as tdtypeid, composite
> Datums only have regular composite type OIDs in datum_typeid. (The last
> would be expected anyway, since the physical representation of a domain
> value is never different from that of its base type.)
>
> Doing it that way led to a nicely small patch, only about 160 net new
> lines of code. However, not touching the executor meant not touching
> the behavior of functions that return domains, even if the domain is
> domain-over-composite. In code terms this means that
> get_call_result_type() and sibling functions will return TYPEFUNC_SCALAR
> not TYPEFUNC_COMPOSITE for such a result type. The difficulty with
> changing that is that if these functions look through the domain, then
> the calling code (in, usually, a PL) will simply build and return a result
> of the underlying composite type, failing to apply any domain constraints.
> Trying to get out-of-core PLs on board with a change in those requirements
> seems like a risky proposition.
>
> Concretely, consider
>
> create type complex as (r float8, i float8);
> create domain dcomplex as complex;
>
> You can make a SQL-language function to return complex in either of two
> ways:
>
> create function fc() returns complex language sql
> as 'select 1.0::float8, 2.0::float8';
>
> create function fc() returns complex language sql
> as 'select row(1.0::float8, 2.0::float8)::complex';
>
> As the patch stands, though, only the second way works for domains over
> composite:
>
> regression=# create function fdc() returns dcomplex language sql
> as 'select 1.0::float8, 2.0::float8';
> ERROR: return type mismatch in function declared to return dcomplex
> DETAIL: Final statement must return exactly one column.
> CONTEXT: SQL function "fdc"
> regression=# create function fdc() returns dcomplex language sql
> as 'select row(1.0::float8, 2.0)::dcomplex';
> CREATE FUNCTION
>
> Now, maybe that's fine. SQL-language functions have never been very
> willing to insert implicit casts to get to the function result type,
> and certainly the only way that the first definition could be legal
> is if there were an implicit up-cast to the domain type. It might be
> OK to just leave it like this, though some documentation about it
> would be a good idea.
>
> plpgsql functions seem generally okay as far as composite domain return
> types go, because they don't have anything corresponding to the row
> return convention of SQL functions. And plpgsql's greater willingness
> to do implicit coercions reduces the notational burden, too. But
> there's some work yet to be done to get plpgsql to realize that
> composite domain local variables have substructure. For example,
> this works:
>
> declare x complex;
> ...
> x.r := 1;
>
> but it fails if x is dcomplex. But ISTM that that would be better
> handled as a followon feature patch. I suspect that the other PLs may
> have similar issues where it'd be nice to allow domain-over-composite
> to act like a plain composite for specific purposes; but I've not looked.
>
> Another issue related to function result types is that the parser
> considers a function-in-FROM returning a composite domain to be
> producing a scalar result not a rowtype. Thus you get this for a
> function returning complex:
>
> regression=# select * from fc();
> r | i
> ---+---
> 1 | 2
> (1 row)
>
> but this for a function returning dcomplex:
>
> regression=# select * from fdc();
> fdc
> -------
> (1,2)
> (1 row)
>
> I think that that could be changed with only local changes in parse
> analysis, but do we want to change it? Arguably, making fdc() act the
> same as fc() here would amount to implicitly downcasting the domain to
> its base type. But doing so here is optional, not necessary in order to
> make the statement sane at all, and it's arguable that we shouldn't do
> that if the user didn't tell us to. A user who does want that to happen
> can downcast explicitly:
>
> regression=# select * from cast(fdc() as complex);
> r | i
> ---+---
> 1 | 2
> (1 row)
>
> (For arcane syntactic reasons you can't abbreviate CAST with :: here.)
> Another point is that if you do want the domain value as a domain
> value, and not smashed to its base type, it would be hard to get at
> if the parser acts this way --- "foo.*" would end up producing the base
> rowtype, or if it didn't, we'd have some issues with the previously
> noted rule about whole-row Vars never having domain types.
>
> So there's a case to be made that this behavior is fine as-is, but
> certainly you could also argue that it's a POLA violation.
>
> Digression: one reason I'm hesitant to introduce inessential reductions
> of domains to base types is that I'm looking ahead to arrays over
> domains, which will provide a workaround for the people who complain
> that they wish 2-D arrays would work type-wise like arrays of 1-D array
> objects. If you "create domain inta as int[]" then inta[] would act
> like an array of array objects, mostly solving the problem I think.
> But it solves the problem only because we don't consider that a domain
> is indistinguishable from its base type. It's hard to be sure without
> having done the work yet, but I think there will be cases where being
> over-eager to treat a domain as its base type might break the behavior
> we want for that case. So I don't want to create a precedent for that
> here.
>
> Thoughts?
>

This is a pretty nice patch, and very small indeed all things
considered. From a code point of view I have no criticism, although
maybe we need to be a bit more emphatic in the header file comments
about the unwisdom of using get_expr_result_tupdesc().

I do think that treating a function returning a domain-over-composite
differently from one returning a base composite is a POLA. We'd be very
hard put to explain the reasons for it to an end user.

I also think we shouldn't commit this until we have accompanying patches
for the core PLs, at least for plpgsql but I bet there are things that
should be fixed for the others too.

cheers

andrew

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message chenhj 2017-09-28 15:44:34 Re: [PATCH]make pg_rewind to not copy useless WAL files
Previous Message Alexander Kuzmenkov 2017-09-28 15:27:09 Re: PoC: full merge join on comparison clause