Re: (FAQ?) JOIN condition - 'WHERE NULL = NULL'

From: Ted Byers <r(dot)ted(dot)byers(at)rogers(dot)com>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>, Ian Sillitoe <ian(dot)sillitoe(at)googlemail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: (FAQ?) JOIN condition - 'WHERE NULL = NULL'
Date: 2008-04-02 17:48:28
Message-ID: 831920.14120.qm@web88303.mail.re4.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

--- Martijn van Oosterhout <kleptog(at)svana(dot)org> wrote:
> 'transform_null_equals' won't help you at all here
> since it only help
> in the very specific case of comparing with a
> constant. The easiest is
> to think of NULL as meaning 'unknown'. Clearly you
> can't compare that
> usefully with anything.
>
Not even a null in another record ... (hence my
question below). If the value is unknown, then it
could be anything, and (thinking as a mathematician
considering real numbers) the probability of two
records having null having their true, but unknown
values be the same is indistinguishable from 0. (with
integers or decimal numbers or floating point numbers,
that would be qualified with the clause, for practical
purposes :)

> Perhaps you can use a marker like -1 to achieve the
> effect you want?
>
Is that really valid, though, especially in a join? I
mean, if the column in question has multiple nulls, in
each of the tables, then how do you, of the DB, figure
out which of the rows containing nulls in the one
table match up with rows in the other table containing
nulls? Or is the resultset supposed to be the product
of the two sets (match each row with a null in the one
table with each row with a null in the other)? That,
for me, creates a nightmare situation where some of my
tables have tens of millions of rows, and if even 1%
of the rows contains null in the relevant column, I
don't even want to think about processing the
resultset that would be produced from such an idea
using these tables.

My joins always only involve primary keys, or indeces
on columns that prohibit nulls, so this problem
doesn't crop up in my code, but would I be not too far
from right in expecting that the rational thing to do
when creating a join on columns that allow nulls is to
exclude ALL rows, in either table, where the columns
involved are null?

Cheers,

Ted

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Rodrigo E. De León Plicet 2008-04-02 18:07:04 Re: Problem with planner choosing nested loop
Previous Message mailtolouis2020-postgres 2008-04-02 17:38:45 timestamp problem