Re: Meeting recap - Logic and Databases with Jeff Davis

From: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: PDX PostgreSQL Users <pdxpug(at)postgresql(dot)org>
Subject: Re: Meeting recap - Logic and Databases with Jeff Davis
Date: 2008-06-24 05:29:06
Message-ID: F7701C19-4DAE-4ECC-82C9-414E2E55B6ED@kineticode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pdxpug

On Jun 22, 2008, at 10:49, Jeff Davis wrote:

> Compare all those problems with something simple like just having the
> tables: approval, purchase, and shipment_arrival. In that case, it's
> simple to draw arbitrary implications from the data without awkward
> exclusions and CASE statements (or any of the problems above).

Oh, yes, if you can get a schema that better maps the structure of the
data without having to add a state, then fantastic.

> Someone querying the data only needs to understand the predicates of
> the
> relations, they don't need to understand what the states mean, nor do
> they need to work backward from the meaning of the states to some kind
> of query that properly handles the various states.
>
> The problems with states stem from the fact that relational
> expressions
> (and SQL) are declarative, but states imply some kind of state
> machine.
> The states pull us away from declarative language and force us into
> imperative language.

Yes, nicely put.

> Maybe I should write a blog entry about this.

If you do, be sure to include that last paragraph. It's a keeper. ;-)

> I would argue that your data dictionary should match your relations.
> If
> you have useful definitions in your data dictionary, why not make the
> relations match, so that you can easily make logical inferences using
> the relational operators?
>
> Even a CSV file can have a data dictionary. I don't think a data
> dictionary is a justification for a weakness in a database design
> (although it's certainly better than nothing).

Fair enough.

>> Boy, that sure seems like a bug. This should be legal, though:
>
> It's not just a bug, it's a standardized bug.

Yes, and one best avoided where possible.

> SQL is confusing because it uses NULLs in at least two senses:
> (1) Unknown. This is a value: the third truth value. Operators,
> functions, and IN all think this is what NULL means.

It's how I tend to use it, as well. Except where I can't, of course.

> (2) Nothingness. This is not a value. Aggregates and outer joins think
> this is what NULL means.
>
> COALESCE can obviously help you switch between those two senses of
> NULL,
> but why should you have to? Why should you get a result that *looks*
> correct from a query that *looks* correct when it's actually wrong?

Because the standard is fucked up?

> My talk slides have a full example that illustrates such a query:
> http://www.pgcon.org/2008/schedule/events/83.en.html
>
> Any language can do anything, so I don't think of a workaround like
> COALESCE as a justification for the bad standard behavior.

No, of course not. But at least it's there. The standard isn't likely
to change.

>> Your explanation makes perfect sense. It's just that the first case
>> seems wrong (to me, at least).
>
> It is wrong -- or at least horribly inconsistent. If SQL really wanted
> to have both unknown and nothingness, it should have called them two
> separate things.

Absolutely.

>> Yeah. COALESCE() is your friend. This is also why I try to make
>> columns NOT NULL as often as possible. NULLs are pretty evil.
>
> I'll take this opportunity to point out that COALESCE has similar
> problems to using states: it requires special cases and the result
> always looks right even when it's wrong.
>
> Compare to nil in Ruby. Every operator in SQL is defined for NULL
> input
> in SQL, but virtually no operators are defined for nil input in Ruby.
> This means that a wrong handling of the special value nil in Ruby will
> almost always result in an error, but in SQL will produce a
> correct-looking result.
>
> Of course, using special cases (like nil) is still imperative
> programming, but at least it's slightly less error prone.
>
> When NULL is used in two very different senses like that in SQL, it
> effectively makes it an untyped system like assembly.

Maybe this is why CJ Date hates NULLs so much. Or at least one reason
why.

Best,

David

In response to

Browse pdxpug by date

  From Date Subject
Next Message David E. Wheeler 2008-06-24 05:30:38 Re: Meeting recap - Logic and Databases with Jeff Davis
Previous Message Jeff Davis 2008-06-23 00:16:54 Re: Meeting recap - Logic and Databases with Jeff Davis