Re: ORDER BY and NULLs

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: mailreg(at)numerixtechnology(dot)de
Cc: Jean-Luc Lachance <jllachan(at)sympatico(dot)ca>, pgsql-sql(at)postgresql(dot)org
Subject: Re: ORDER BY and NULLs
Date: 2004-09-19 16:58:57
Message-ID: 19733.1095613137@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

T E Schmitz <mailreg(at)numerixtechnology(dot)de> writes:
> You must've been reading my mind. I was just wondering what to do about
> indexing on that particular table. I read somewhere that an Index is not
> going to improve the performance of an ORDER BY if the sort column
> contains NULLs because NULLs aren't indexed?

Whatever you were reading had it pretty badly garbled :-(

Btree indexes *do* store nulls, so the presence of nulls doesn't affect
whether they are usable for meeting an ORDER BY spec. However the index
sort order does have to exactly match the ORDER BY list, and even then
it's not necessarily the case that the index is useful. The brutal fact
is that seqscan-and-sort is generally faster than a full-table indexscan
for large tables anyway, unless the table is clustered or otherwise
roughly in order by the index.

If you are going to use an ORDER BY that involves COALESCE or NOT NULL
expressions, then the only way that it could be met with an index is to
create an expressional index on exactly that list of expressions. For
instance

regression=# create table foo (f int, t int);
CREATE TABLE
regression=# explain select * from foo order by f, coalesce(t, -1);
QUERY PLAN
-------------------------------------------------------------
Sort (cost=69.83..72.33 rows=1000 width=8)
Sort Key: f, COALESCE(t, -1)
-> Seq Scan on foo (cost=0.00..20.00 rows=1000 width=8)
(3 rows)

regression=# create index fooi on foo (f, (coalesce(t, -1)));
CREATE INDEX
regression=# explain select * from foo order by f, coalesce(t, -1);
QUERY PLAN
--------------------------------------------------------------------
Index Scan using fooi on foo (cost=0.00..52.00 rows=1000 width=8)
(1 row)

regression=#

I'm a bit dubious that such an index would be worth its update costs,
given that it's likely to be no more than a marginal win for the query.
But try it and see.

> Jean-Luc Lachance wrote:
>> If you have large amount of rows (with or without nulls) it is faster if
>> use a partial index.

This advice seems entirely irrelevant to the problem of sorting the
whole table...

regards, tom lane

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Tom Lane 2004-09-19 17:12:07 Re: How to check postgres running or not ?
Previous Message Andrew Sullivan 2004-09-19 16:57:48 Re: How to check postgres running or not ?