Re: Assigning ROW variable having NULL value to RECORD type variable doesn't give any structure to the RECORD variable.

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Assigning ROW variable having NULL value to RECORD type variable doesn't give any structure to the RECORD variable.
Date: 2020-01-03 20:39:39
Message-ID: 2823.1578083979@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Wed, Jan 1, 2020 at 10:50 AM Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com> wrote:
>> I know this is expected to happen considering the changes done in
>> above commit because from this commit onwards, NULL value assigned to
>> any row variable represents a true NULL composite value before this
>> commit it used to be a tuple with each column having null value in it.
>> But, the point is, even if the row variable is having a NULL value it
>> still has a structure associated with it. Shouldn't that structure be
>> transferred to RECORD variable when it is assigned with a ROW type
>> variable ? Can we consider this behaviour change as a side effect of
>> the improvement done in the RECORD type of variable?

> I'm not an expert on this topic. However, I *think* that you're trying
> to distinguish between two things that are actually the same. If it's
> a "true NULL," it has no structure; it's just NULL. If it has a
> structure, then it's really a composite value with a NULL in each
> defined column, i.e. (NULL, NULL, NULL, ...) for some row type rather
> than just NULL.

Yeah. In general, we can't do this, because a null value of type
RECORD simply hasn't got any information about what specific rowtype
might be involved. In the case where the null is of a named composite
type, rather than RECORD, we could choose to act differently ... but
I'm not really sure that such a change would be an improvement and not
just a decrease in consistency.

In any case, plpgsql's prior behavior was an implementation artifact
with very little to recommend it. As a concrete example, consider

create table t1(a int, b text);

do $$
declare x t1; r record;
begin
x := null;
r := x;
raise notice 'r.a = %', r.a;
end $$;

do $$
declare r record;
begin
r := null::t1;
raise notice 'r.a = %', r.a;
end $$;

I assert that in any sanely-defined semantics, these two examples
should give the same result. In v11 and up, they both give
'record "r" is not assigned yet' ... but in prior versions, they
gave different results. I do not want to go back to that.

On the other hand, we now have

do $$
declare x t1; r record;
begin
x := null;
r := x;
raise notice 'x.a = %', x.a;
raise notice 'r.a = %', r.a;
end $$;

which gives

NOTICE: x.a = <NULL>
ERROR: record "r" is not assigned yet

which is certainly also inconsistent. The variable declared as
being type t1 behaves, for this purpose, as if it contained
"row(null,null)" not just a simple null. But if you print it,
or assign it to something else as a whole, you'll find it just
contains a simple null. One way to see that these are different
states is to do

do $$ declare x t1; begin x := null; raise notice 'x = %', x; end$$;
NOTICE: x = <NULL>

versus

do $$ declare x t1; begin x := row(null,null); raise notice 'x = %', x; end$$;
NOTICE: x = (,)

And, if you assign a row of nulls to a record-type variable, that works:

do $$
declare x t1; r record;
begin
x := row(null,null);
r := x;
raise notice 'x.a = %', x.a;
raise notice 'r.a = %', r.a;
end $$;

which gives

NOTICE: x.a = <NULL>
NOTICE: r.a = <NULL>

If we were to change this behavior, I think it would be tantamount
to sometimes expanding a simple null to a row of nulls, and I'm
not sure that's a great idea.

The SQL standard is confusing in this respect, because it seems
that at least the "x IS [NOT] NULL" construct is defined to
consider both a "simple NULL" and ROW(NULL,NULL,...) as "null".
But we've concluded that other parts of the spec do allow for
a distinction (I'm too lazy to search the archives for relevant
discussions, but there have been some). The two things are
definitely different implementation-wise, so it would be hard
to hide the difference completely.

Another fun fact is that right now, assignment of any null value
to a composite plpgsql variable works the same: you can assign a simple
null of some other composite type, or even a scalar null, and behold you
get a null composite value without any error. That's because
exec_assign_value's DTYPE_REC case pays no attention to the declared
type of the source value once it's found to be null. Thus

do $$ declare x t1; begin x := 42; raise notice 'x = %', x; end$$;
ERROR: cannot assign non-composite value to a record variable

do $$ declare x t1; begin x := null::int; raise notice 'x = %', x; end$$;
NOTICE: x = <NULL>

That's pretty bizarre, and I don't think I'd agree with adopting those
semantics if we were in a green field. But if we start paying attention
to the specific type of a null source value, I bet we're going to break
some code that works today.

Anyway, maybe this area could be improved, but I'm not fully convinced.
I definitely do not subscribe to the theory that we need to make it
work like v10 again.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Merlin Moncure 2020-01-03 20:51:37 Re: Greatest Common Divisor
Previous Message Robert Haas 2020-01-03 20:30:47 Re: global / super barriers (for checksums)