Re: group by will not use an index?

From: "Adam Rich" <adam(dot)r(at)sbcglobal(dot)net>
To: "'tsuraan'" <tsuraan(at)gmail(dot)com>, "'pgsql-performance'" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: group by will not use an index?
Date: 2007-01-09 23:32:50
Message-ID: 031b01c73446$7a4250f0$6400a8c0@dualcore
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

That query looks strange to me (a group by without an aggregate). See
if this is
any faster:

SELECT DISTINCT DATE(inserted) FROM Messages

I won't hold my breath though, I don't think there's any way around the
full table scan
in Postgres, because the index does not contain enough information about
transactional
state, so table access is always required (unlike virtually every other
type of db)

-----Original Message-----
From: pgsql-performance-owner(at)postgresql(dot)org
[mailto:pgsql-performance-owner(at)postgresql(dot)org] On Behalf Of tsuraan
Sent: Tuesday, January 09, 2007 5:06 PM
To: pgsql-performance
Subject: [PERFORM] group by will not use an index?

I have a table of messages with paths and inserted dates (among other
things), like so:

CREATE TABLE Messages (
msgkey BIGSERIAL PRIMARY KEY,
path TEXT NOT NULL,
inserted TIMESTAMP WITHOUT TIMEZONE DEFAULT NOW()
);

I run a query to determine which days actually saw emails come in, like
so:

SELECT DATE(inserted) FROM Messages GROUP BY DATE(inserted);

That's obviously not very efficient, so I made an index:

CREATE INDEX messages_date_inserted_ind ON Messages(DATE(inserted));

However, GROUP BY does not use this index:

=# explain analyze select date(inserted) from messages group by
date(inserted);
QUERY PLAN

------------------------------------------------------------------------
------------------------------------------------------
HashAggregate (cost=104773.10..104789.51 rows=1313 width=8) (actual
time=31269.476..31269.557 rows=44 loops=1)
-> Seq Scan on messages (cost=0.00..101107.25 rows=1466340 width=8)
(actual time=23.923..25248.400 rows=1467036 loops=1)
Total runtime: 31269.735 ms
(3 rows)

Is it possible to get pg to use an index in a group by? I don't see why
it wouldn't be possible, but maybe I'm missing something.

Using pg 8.1.4...

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Gregory S. Williamson 2007-01-09 23:40:51 Re: Horribly slow query/ sequential scan
Previous Message tsuraan 2007-01-09 23:05:48 group by will not use an index?