Skip site navigation (1) Skip section navigation (2)

Re: Why does a simple query not use an obvious index?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Jack Kerkhof" <jack(dot)kerkhof(at)guest-tek(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Why does a simple query not use an obvious index?
Date: 2004-08-29 22:03:43
Message-ID: 16093.1093817023@sss.pgh.pa.us (view raw or flat)
Thread:
Lists: pgsql-performance
>> select somefield from sometable where timestampfield > now()-'60
>> seconds'::interval

This is a FAQ, but since the archives don't seem to be up at the moment,
here's the answer once again:

The expression "now() - something" is not a constant, so the planner
is faced with "timestampfield > unknownvalue".  Its default assumption
about the number of rows that will match is much too high to make an
indexscan look profitable (from memory, I think it guesses that about
a third of the table will match...).

There are a couple of hacks you can use to deal with this.  Plan A
is just "set enable_seqscan = false" for this query.  This is ugly and
not really recommended, but you should try it first to verify that you
do get an indexscan that way, just to be sure that lack of statistics
is the problem and not something else.

Plan B is to add an extra WHERE clause to make the problem look like a
range query, eg

	where timestampfield > now() - ... AND timestampfield <= now();

The planner still doesn't know the exact values involved, so it still
can't make use of any statistics, but it can see that this is a range
constraint on timestampfield.  The default guess about the selectivity
will be a lot smaller than in the case of the one-sided inequality,
and in most cases you should get an indexscan out of it.  This isn't
completely guaranteed though.  Also, it's got a severe problem in that
if you sometimes do queries with a large interval, it'll still do an
indexscan even though that may be quite inappropriate.

Plan C is to fix things so that the compared-to value *does* look like a
constant; then the planner will correctly observe that only a small part
of the table is to be scanned, and do the right thing (given reasonably
up-to-date ANALYZE statistics, anyway).  The most trustworthy way of
doing that is to compute the "now() - interval" value on the client side
and send over a timestamp constant.  If that's not convenient for some
reason, people frequently use a hack like this:

	create function ago(interval) returns timestamptz as
	'select now() - $1' language sql strict immutable;

	select ... where timestampfield > ago('60 seconds');

This is a kluge because you are lying when you say that the result of
ago() is immutable; it obviously isn't.  But the planner will believe
you, fold the function call to a constant during planning, and use the
result.  CAUTION: this works nicely for interactively-issued SQL
queries, but it will come back to bite you if you try to use ago() in
prepared queries or plpgsql functions, because the premature collapsing
of the now() result will become significant.

We have speculated about ways to get the planner to treat expressions
involving now() and similar functions as pseudo-constants, so that it
would do the right thing in this sort of situation without any kluges.
It's not been done yet though.

BTW, the above discussion applies to PG 7.3 and later; if you're dealing
with an old version then there are some different considerations.

			regards, tom lane

In response to

Responses

pgsql-performance by date

Next:From: Steinar H. GundersonDate: 2004-08-29 22:04:13
Subject: Re: Why does a simple query not use an obvious index?
Previous:From: Scott MarloweDate: 2004-08-29 21:44:15
Subject: Re: Why does a simple query not use an obvious index?

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group