Re: Why does a simple query not use an obvious index?

From: Mr Pink <mr_pink_is_the_only_pro(at)yahoo(dot)com>
To: Jack Kerkhof <jack(dot)kerkhof(at)guest-tek(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Why does a simple query not use an obvious index?
Date: 2004-08-29 18:04:48
Message-ID: 20040829180448.41247.qmail@web41108.mail.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Strangely enough, I don't find that result surprising.

if the vast bulk of the data is in the past and now()-60 represents a very small slice of the data
we might expect that using an index is optimal, but there could be many reasons why it doesn't get
used.

AFAIK postgres doesn't peek at values used in a query when optimizing so any query with a ">" type
condition is gonna have a seq scan as the plan since the best guess is that you are gonna match
50% of the table. That's one possible explanation.

Another is that if the condition data types don't match then an indes won't be used you could try:

select count(*) from billing where timestamp > (now()-60)::timestamp

Might make a difference, I dunno, it's a case of testing amd seing what happens.

You could try lowering the random page cost, it might help, but I don't like your chances.

If your problem is that you want to access the most recent data from a large table with fast
response, then you could consider:

1. a "recent" index. If the data is within the "recent" time from set a flag to true, other wise
null. Reset the flags periodically. Nulls aren't indexed so the selectivity of such an index is
much higher. Can work wonders.

2, duplicate recent data in another table that is purged when data passes the age limit. This is
basic archiving.

Something like that. Hopefully someone with more knowlege of the optimaizer will have a brighter
suggestion for you.

What version are you using by the way?

Regards
Mr Pink

--- Jack Kerkhof <jack(dot)kerkhof(at)guest-tek(dot)com> wrote:

> The query:
> select count(*) from billing where timestamp > now()-60
>
> should obviously use the index
>
> CREATE INDEX billing_timestamp_idx ON billing USING btree ("timestamp"
> timestamp_ops);
>
> on a table with 1400000 rows.
>
> But it uses a Seq Scan. If I set enable_seqscan=no, it indicates a queryplan
> could not be calculated.
>
> Why does this simple query not use the timestamp index, and how can I get it
> to?
>
> Thanks, Jack
>
> Jack Kerkhof
> Research & Development
> jack(dot)kerkhof(at)guest-tek(dot)com
> www.guest-tek.com
> 1.866.509.1010 3480
>
> --------------------------------------------------------------------------
>
> Guest-Tek is a leading provider of broadband technology solutions for
> the hospitality industry. Guest-Tek's GlobalSuiteEhigh-speed Internet
> solution enables hotels to offer their guests the convenience of wired
> and/or wireless broadband Internet access from guest rooms, meeting rooms
> and public areas.
>
>
>


__________________________________
Do you Yahoo!?
Take Yahoo! Mail with you! Get it on your mobile phone.
http://mobile.yahoo.com/maildemo

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Scott Marlowe 2004-08-29 18:28:48 Re: Why does a simple query not use an obvious index?
Previous Message Josh Berkus 2004-08-28 21:34:23 Re: Performance hit on loading from HD