Re: Domains and arrays and composites, oh my

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Domains and arrays and composites, oh my
Date: 2017-07-13 19:19:14
Message-ID: 12364.1499973554@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I wrote:
> I started to look into allowing domains over composite types, which is
> another never-implemented case that there's no very good reason not to
> allow. Well, other than the argument that the SQL standard only allows
> domains over "predefined" (built-in) types ... but we blew past that
> restriction ages ago.

Attached is a draft patch that allows domains over composite types.
I think it's probably complete on its own terms, but there are some
questions around behavior of functions returning domain-over-composite
that could use discussion, and some of the PLs need some follow-on work.

The core principle here was to not modify execution-time behavior by
adding domain checks to existing code paths. Rather, I wanted the
parser to insert CoerceToDomain nodes wherever checks would be needed.
Row-returning node types such as RowExpr and FieldStore only return
regular composite types, as before; to generate a value of a composite
domain, we construct a value of the base type and then CoerceToDomain.
(This is pretty analogous to what happens for domains over arrays.)
Whole-row Vars can only have regular composite types as vartype,
TupleDescs can only have regular composite types as tdtypeid, composite
Datums only have regular composite type OIDs in datum_typeid. (The last
would be expected anyway, since the physical representation of a domain
value is never different from that of its base type.)

Doing it that way led to a nicely small patch, only about 160 net new
lines of code. However, not touching the executor meant not touching
the behavior of functions that return domains, even if the domain is
domain-over-composite. In code terms this means that
get_call_result_type() and sibling functions will return TYPEFUNC_SCALAR
not TYPEFUNC_COMPOSITE for such a result type. The difficulty with
changing that is that if these functions look through the domain, then
the calling code (in, usually, a PL) will simply build and return a result
of the underlying composite type, failing to apply any domain constraints.
Trying to get out-of-core PLs on board with a change in those requirements
seems like a risky proposition.

Concretely, consider

create type complex as (r float8, i float8);
create domain dcomplex as complex;

You can make a SQL-language function to return complex in either of two
ways:

create function fc() returns complex language sql
as 'select 1.0::float8, 2.0::float8';

create function fc() returns complex language sql
as 'select row(1.0::float8, 2.0::float8)::complex';

As the patch stands, though, only the second way works for domains over
composite:

regression=# create function fdc() returns dcomplex language sql
as 'select 1.0::float8, 2.0::float8';
ERROR: return type mismatch in function declared to return dcomplex
DETAIL: Final statement must return exactly one column.
CONTEXT: SQL function "fdc"
regression=# create function fdc() returns dcomplex language sql
as 'select row(1.0::float8, 2.0)::dcomplex';
CREATE FUNCTION

Now, maybe that's fine. SQL-language functions have never been very
willing to insert implicit casts to get to the function result type,
and certainly the only way that the first definition could be legal
is if there were an implicit up-cast to the domain type. It might be
OK to just leave it like this, though some documentation about it
would be a good idea.

plpgsql functions seem generally okay as far as composite domain return
types go, because they don't have anything corresponding to the row
return convention of SQL functions. And plpgsql's greater willingness
to do implicit coercions reduces the notational burden, too. But
there's some work yet to be done to get plpgsql to realize that
composite domain local variables have substructure. For example,
this works:

declare x complex;
...
x.r := 1;

but it fails if x is dcomplex. But ISTM that that would be better
handled as a followon feature patch. I suspect that the other PLs may
have similar issues where it'd be nice to allow domain-over-composite
to act like a plain composite for specific purposes; but I've not looked.

Another issue related to function result types is that the parser
considers a function-in-FROM returning a composite domain to be
producing a scalar result not a rowtype. Thus you get this for a
function returning complex:

regression=# select * from fc();
r | i
---+---
1 | 2
(1 row)

but this for a function returning dcomplex:

regression=# select * from fdc();
fdc
-------
(1,2)
(1 row)

I think that that could be changed with only local changes in parse
analysis, but do we want to change it? Arguably, making fdc() act the
same as fc() here would amount to implicitly downcasting the domain to
its base type. But doing so here is optional, not necessary in order to
make the statement sane at all, and it's arguable that we shouldn't do
that if the user didn't tell us to. A user who does want that to happen
can downcast explicitly:

regression=# select * from cast(fdc() as complex);
r | i
---+---
1 | 2
(1 row)

(For arcane syntactic reasons you can't abbreviate CAST with :: here.)
Another point is that if you do want the domain value as a domain
value, and not smashed to its base type, it would be hard to get at
if the parser acts this way --- "foo.*" would end up producing the base
rowtype, or if it didn't, we'd have some issues with the previously
noted rule about whole-row Vars never having domain types.

So there's a case to be made that this behavior is fine as-is, but
certainly you could also argue that it's a POLA violation.

Digression: one reason I'm hesitant to introduce inessential reductions
of domains to base types is that I'm looking ahead to arrays over
domains, which will provide a workaround for the people who complain
that they wish 2-D arrays would work type-wise like arrays of 1-D array
objects. If you "create domain inta as int[]" then inta[] would act
like an array of array objects, mostly solving the problem I think.
But it solves the problem only because we don't consider that a domain
is indistinguishable from its base type. It's hard to be sure without
having done the work yet, but I think there will be cases where being
over-eager to treat a domain as its base type might break the behavior
we want for that case. So I don't want to create a precedent for that
here.

Thoughts?

Obviously this is work for v11; I'll go stick it in the next commitfest.

regards, tom lane

Attachment Content-Type Size
domains-over-composites-1.patch text/x-diff 29.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David G. Johnston 2017-07-13 19:27:26 Re: CAST vs ::
Previous Message Robert Haas 2017-07-13 19:13:33 Re: PostgreSQL - Weak DH group