BUG #2658: Query not using index

From: "Graham Davis" <gdavis(at)refractions(dot)net>
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #2658: Query not using index
Date: 2006-09-27 20:56:32
Message-ID: 200609272056.k8RKuWjY040283@wwwmaster.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-performance


The following bug has been logged online:

Bug reference: 2658
Logged by: Graham Davis
Email address: gdavis(at)refractions(dot)net
PostgreSQL version: 8.1.4
Operating system: Linux
Description: Query not using index
Details:

I know that in version 8 you guys added support so that aggregate functions
can take advantage of indexes. However, I have a simple query that is not
taking advantage of an index where I believe it should.

I have a large table full of GPS positions. I want to query the table for
the most recent location of each asset (an asset is essentially a vehicle).
The ts column is the timestamp, so I am using this to figure out the most
recent position. I use the following query to do it:

SELECT assetid, max(ts) AS ts
FROM asset_positions
GROUP BY assetid;

I have an index on (ts), another index on (assetid) and a multikey index on
(assetid, ts). I know the assetid index is pointless since the multikey one
takes its place, but I put it there while testing just to make sure. The
ANALYZE EXPLAIN for this query is:

QUERY PLAN
----------------------------------------------------------------------------
-------------------------------------------------------------
HashAggregate (cost=125423.96..125424.21 rows=20 width=12) (actual
time=39693.995..39694.036 rows=20 loops=1)
-> Seq Scan on asset_positions (cost=0.00..116654.64 rows=1753864
width=12) (actual time=20002.362..34724.896 rows=1738693 loops=1)
Total runtime: 39694.245 ms
(3 rows)

You can see it is doing a sequential scan on the table when it should be
using the (assetid, ts) index, or at the very least the (ts) index. This
query takes about 40 seconds to complete with a table of 1.7 million rows.
I tested running the query without the group by as follows:

SELECT max(ts) AS ts
FROM asset_positions;

This query DOES use the (ts) index and takes less than 1 ms to complete. So
I'm not sure why my initial query is not using one of the indexes. I have
to use the GROUP BY in my query so that I get the max ts of EACH asset.

I've tried restructuring my query so that it will use an index, but nothing
seems to work. I tried this syntax for example:

SELECT DISTINCT ON (assetid) assetid, ts
FROM asset_positions
ORDER BY assetid, ts DESC;

It still does a sequential scan and takes 40+ seconds to complete. If I am
missing something here, please let me know, but I believe this is a bug that
needs addressing. If it is not a bug (and there just simply isn't support
for this with multikey indexes yet), please let me know so I can either try
restructuring the coding I am working on, or move on for now. The
documentation does not mention anything about this, but I know from reading
a list of changes in version 8 that this sort of support was added for
aggregate functions. If you need more information, please let me know,
thanks in advance.

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Christoph Zwerschke 2006-09-28 07:52:22 Re: BUG #2642: Connection problems without IPv6
Previous Message Tom Lane 2006-09-27 19:13:22 Re: BUG #2657: 8.2beta1: Startup failures on Windows platform

Browse pgsql-performance by date

  From Date Subject
Next Message Matthew Schumacher 2006-09-27 21:33:09 Re: Problems with inconsistant query performance.
Previous Message Jim C. Nasby 2006-09-27 20:31:55 Re: Problems with inconsistant query performance.