Re: Index not used with IS NULL

From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Dmitry Tkach <dmitry(at)openratings(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: Index not used with IS NULL
Date: 2003-02-18 02:20:29
Message-ID: 20030218022029.GB23778@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Mon, Feb 17, 2003 at 08:46:17PM -0500, Tom Lane wrote:
> Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> > I would have thought that the other index type supported null anyway, for
> > the purposes of uniqueness checks.
>
> Well, (a) the other index types don't support uniqueness checks, and (b)
> it wouldn't be relevant anyway, because multiple nulls don't violate
> a unique constraint. GIST does support nulls in second and subsequent
> columns of a multi-column index, because it *has* to do so, but not in
> the first column --- and hash and rtree don't store nulls at all.

I stand corrected. I just tested it here and multiple nulls in a unique
column indeed do work.

> > I remember looking into this a while ago. My solution to that problem was
> > that x =3D NULL is always NULL and so doesn't need to go through the scan
> > anyway (index or sequential). Once you've taken care of the x =3D NULL case
> > elsewhere, you can use the available state for x IS NULL.
>
> But how do you get from point A to point B? You need to represent both
> cases in ScanKeys further upstream than where that conclusion can be
> drawn (namely _bt_orderkeys()) --- or else do some very substantial
> restructuring work, which is exactly the point.
>
> Also, this would amount to hard-wiring the assumption that indexable
> operators are always strict. Which is rather a curious assumption
> to be putting in, if your goal is to support the obviously-not-strict
> construct IS NULL as an indexable operator. (Now I believe we make
> that assumption anyway in the index access methods ... but wiring it
> into ScanKeys, which is a very widespread data structure, would be the
> death knell for any hope of removing it someday.)

I hadn't thought of that. While I can't think of a situation of a
non-strict indexable operator, I wouldn't want to rule it out.

My Plan B was to create a operator IS (and its inverse ISNOT) which is then
binary operator. It would be identical to = and <> except that it would be
defined where either argument is NULL. Fiddle the parser to use this
operator instead of the unary ISNULL. The disadvantage is that (unless you
restrict it in the parser) you could say things like:

SELECT * FROM x, y WHERE x.field IS y.field

Allowing you to join on NULL fields. This is not allowed by the spec. Do you
think this would be a better approach? Or is there something special about
the ISNULL in SQL does means this cannot work? It does seem a bit wasteful
to have an operator whose second argument is always NULL (unless you allow
the extra syntax).

As a bonus, if this could be made to work, you *know* your index operators
don't need to be strict.

--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Support bacteria! They're the only culture some people have.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Weiping He 2003-02-18 02:46:52 question about PERFORM and EXECUTE in plpgsql
Previous Message Tom Lane 2003-02-18 01:54:41 Re: foreign key constraints and alter table