Re: Cost of indexscan

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Kari Lavikka <tuner(at)bdb(dot)fi>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Cost of indexscan
Date: 2004-01-30 15:07:34
Message-ID: 13308.1075475254@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Kari Lavikka <tuner(at)bdb(dot)fi> writes:
> Postgres seems to estimate the cost of indexscan to be a bit too high.
> The table has something like 500000 rows and I have run reindex and vacuum
> analyze recently. Is there something to tune?

I think the real problem here is that the row estimate is off by a
factor of thirty:

> Seq Scan on admin_event (cost=0.00..19844.37 rows=154361 width=109) (actual time=479.173..2760.186 rows=4705 loops=1)

With a correct estimate the indexscan would have been chosen.

> galleria=> explain analyze SELECT * FROM admin_event WHERE stamp > (current_timestamp - '1 days'::INTERVAL)::TIMESTAMP WITHOUT TIME ZONE;

It's not possible for the planner to make a good guess here since it
doesn't know what the comparison value for the stamp column is.
(current_timestamp isn't a constant and so the comparison expression
can't be reduced to a constant at plan time.)

The traditional solution for this is to cheat:

create function ago(interval) returns timestamp without time zone as
'select localtimestamp - $1' language sql strict immutable;

select * from admin_event where stamp > ago('1 days');

This works because the function is mislabeled as immutable, encouraging
the planner to fold the result to a constant on sight. It also has the
pleasant property of making your query more readable. The downside is
that you are in fact lying to the system about the behavior of the ago()
function, and so you can get in trouble. This only really works for
queries executed interactively --- you can't use this method inside
plpgsql functions, for instance.

> Distribution of stamp looks like the following:

Hm, you might also find that increasing the statistics target for stamp
would be a good idea, since its distribution is so skewed. But unless
you do something like the above, the statistics won't get used anyway...

regards, tom lane

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Jack Coates 2004-01-30 17:13:19 Re: query optimization question
Previous Message Tom Lane 2004-01-30 14:51:10 Re: query optimization question