> Date and Pascal hate nulls. One even goes so far as to say that if you
> permit NULLs in a database, then the results from *every* query are
> suspect. So they perform backflips suggesting ways to avoid nulls.
> None, so far, seem appealing.
> To me, nulls are quite useful in the Real World. For instance, there
> may be a lot of immediate value for end users in committing a row that
> has a few nulls (e.g. as in not yet collected), rather than slavishly
> following a rule that says Thou Shalt Not commit a row with nulls.
> Can't the intelligent practitioner simply proceed carefully with
> queries when nulls are involved? With group functions at least, I
> believe nulls are ignored. In Oracle, you can use NVL() to force group
> functions to recognize them.
> What about simply doing an IS NULL test in the code when any table that
> allows nulls is involved in a query?
> What precisely has Date and Pascal's knickers in such a twist? The fact
> that ad hoc queries from random, unintelligent users could give strange
> results?
> What if one has control over the queries performed through a GUI or
> application? Doesn't the problem disappear, presuming the programmer is
> aware of the issue and the application is well documented?
> What are some of the best ways to deal with the issue of nulls
> potentially resulting in questionable query results short of
> disallowing them? Storing some sort of coded or numeric value that
> represents UNKNOWN or NOT APPLICABLE?
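The aggregate behavior mentioned in the question is easy to check directly. Here's a minimal sketch using Python's sqlite3 module (table and column names are made up for illustration; NVL() is Oracle-specific, and the portable equivalent is COALESCE):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (grp TEXT, val INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)",
                 [("a", 10), ("a", None), ("a", 20)])

# Aggregates skip NULLs: AVG is computed over 10 and 20 only.
avg_skip = conn.execute("SELECT AVG(val) FROM t").fetchone()[0]

# COALESCE (the portable analogue of Oracle's NVL) forces the
# aggregate to treat NULL as 0, which changes the result.
avg_zero = conn.execute("SELECT AVG(COALESCE(val, 0)) FROM t").fetchone()[0]

# COUNT(val) ignores the NULL row; COUNT(*) does not.
counts = conn.execute("SELECT COUNT(val), COUNT(*) FROM t").fetchone()

print(avg_skip, avg_zero, counts)  # 15.0 10.0 (2, 3)
```

So the answer to the question is yes - aggregates silently drop NULLs, and whether that is the result you want depends entirely on what the NULL was supposed to mean.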
I'd say the "problem" with NULL values is mainly that they conflict with
some of the ideas of relational theory and relational algebra.
One of the basic ideas of relational theory is that of functional
dependencies. Database normalization (at least according to the theory)
tells you how to deduce a "good" schema, if you know what columns you
need, and what their functional dependencies are.
Of course, the functions described by those functional dependencies are
not required to be defined for every possible value - let's say you have
a functional dependency A -> B - meaning that whenever you know the value
of column A, then there is _at_most_ one value for column B.
Normalization basically tells you to model that functional dependency as
a table containing fields A and B, and to make A the primary key.
Now, if there is no B for a specific value of A, then this table will
just not contain a record for this value of A. But if you allow
NULL-values, then suddenly there are _two_ different ways to express
"I don't know what B is for this A". You could either have a record with
the A-value in question, and with B null, or you could have _no_ record
with the A-value in question.
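To make that ambiguity concrete, here's a small sketch (again with sqlite3 and invented names) showing the two encodings side by side:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ab (a TEXT PRIMARY KEY, b INTEGER)")

# Encoding 1: a row exists for 'x', but B is NULL.
conn.execute("INSERT INTO ab VALUES ('x', NULL)")
# Encoding 2: no row at all exists for 'y'.

# Both are meant to say "B is unknown for this A",
# yet queries see them very differently:
rows_x = conn.execute("SELECT COUNT(*) FROM ab WHERE a = 'x'").fetchone()[0]
rows_y = conn.execute("SELECT COUNT(*) FROM ab WHERE a = 'y'").fetchone()[0]

print(rows_x, rows_y)  # 1 0
```

Nothing in the schema tells you which of the two encodings a given application uses - that convention lives only in the application code or its documentation.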
So, NULLs IMHO give you flexibility - but at a price. The price is that
the _same_ information could be stored in different ways - and you can't
really deduce "the correct way" from the schema alone.
For me, the discussion is quite similar to statically vs. dynamically
typed languages - the first enable the compiler to check your code
more thoroughly, while the second often allow you to write more
concise code.
So, I'd say Date and Pascal are right from a _theoretical_ point of view
- null values really cause problems in the context of relational theory.
Those theoretical problems in turn cause practical problems to some
extent - but so do a _lot_ of other things. Take Java as an example -
at least until Java 1.4, the language had huge theoretical deficiencies
in its type system, but you could nevertheless write good and
maintainable code in Java 1.4. The same is true for SQL with nulls -
there _are_ possibilities to shoot yourself in the foot, but that just
means that developers need to be skilled enough to know about those
pitfalls.
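One classic foot-gun comes from three-valued logic in comparisons. A short sketch with sqlite3 (the behavior follows the SQL standard, so it applies to PostgreSQL and Oracle as well):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (v INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (None,)])

# "v = NULL" is never true - comparing with NULL yields UNKNOWN,
# so this finds nothing, even for the row where v really is NULL.
wrong = conn.execute("SELECT COUNT(*) FROM t WHERE v = NULL").fetchone()[0]

# IS NULL is the correct test.
right = conn.execute("SELECT COUNT(*) FROM t WHERE v IS NULL").fetchone()[0]

# NOT IN against a set containing NULL silently matches no rows at all,
# because every "x <> NULL" comparison evaluates to UNKNOWN.
not_in = conn.execute(
    "SELECT COUNT(*) FROM t "
    "WHERE v NOT IN (SELECT v FROM t WHERE v IS NULL)"
).fetchone()[0]

print(wrong, right, not_in)  # 0 1 0
```

Each of these is well-defined behavior, not a bug - but a developer who doesn't know the rules will get quietly wrong answers rather than errors.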
greetings, Florian Pflug