Re: When is a record NULL?

From: Sam Mason <sam(at)samason(dot)me(dot)uk>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: When is a record NULL?
Date: 2009-07-24 22:15:20
Message-ID: 20090724221520.GE5407@samason.me.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jul 24, 2009 at 12:30:39PM -0500, Kevin Grittner wrote:
> Sam Mason <sam(at)samason(dot)me(dot)uk> wrote:
> > The fact that it happens to be a NULL *value* in one case
>
> Well, according to Codd (and I tend to go with him on this) there is
> no such thing. NULL is a way to flag a place where a value could be
> stored, but is not -- because is unknown or is not applicable in that
> context. (He seemed to feel it was a big weakness of SQL that it
> didn't differentiate between these two conditions, but that's another
> argument.) "NULL value" is an oxymoron.

I think then maybe we're talking about different things; I was trying
to draw attention to the distinction between types and values---types
allow some invariants of the code to be automatically checked before it
is run, a value only has meaning at run time with the set of possible
values an expression is defined over being constrained by its type.

In this dichotomy a NULL is most definitely a value and with my current
experience I don't understand the distinction you're trying to draw.

> >> SQL doesn't provide a test for this case that's separate from the
> >> test involving null-ness of individual fields. Not much we can do
> >> about it though. I'm not entirely sure that exposing the
> >> distinction would be helpful anyway ...
> >
> > I think it would
>
> The distinction between not having a tuple and having a tuple for
> which you don't know any applicable values seems thin. I'm not sure
> what that would really mean.

Other languages/type systems do define this precisely. For example,
in object orientated languages there's a big difference between a
reference to an object being NULL and some member of an object being
NULL. Databases obviously have their own semantics, but the distinction
is well defined.

Any implementation that tries to be faithful to a standard has its hands
somewhat tied and PG is no exception. PG currently seems to be some
hybrid half way between, it internally knows there is a distinction
between the two but it doesn't like to expose this. For example (and
this appears particularly awkward because of annoying limitations in the
syntax PG accepts):

SELECT y
FROM (SELECT 1) x(a)
LEFT JOIN (SELECT 1,2) y(a,b) ON FALSE;

I think it should be valid to express this as:
SELECT (SELECT 1,2 WHERE FALSE);
but PG doesn't like sub-queries returning two columns--but this is
material for another discussion.

This returns a single row whose only attribute is NULL (i.e. it's
rendered as '') and not as a record whose attributes are all NULL (i.e.
rendered as '(,)'). The fact PG does the former says that your mental
model isn't congruent with PGs behavior.

If your model is correct then when the IS DISTINCT FROM operator works
on RECORDs the following should return FALSE for all of the following:

SELECT NULL IS DISTINCT FROM ROW(NULL);
SELECT NULL IS DISTINCT FROM ROW(NULL,NULL);
SELECT NULL IS DISTINCT FROM ROW(NULL,ROW(NULL,NULL));
SELECT ROW(NULL) IS DISTINCT FROM ROW(NULL,ROW(NULL,NULL));

i.e. there is *no* difference between a NULL record and a record
consisting entirely of NULLs.

--
Sam http://samason.me.uk/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2009-07-24 22:18:47 Re: contrib/xml2 pfree bug
Previous Message David E. Wheeler 2009-07-24 22:05:27 Re: When is a record NULL?