Re: Help with query: indexes on timestamps

From: "Keith C(dot) Perry" <netadmin(at)vcsn(dot)com>
To: Richard Huxton <dev(at)archonet(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Help with query: indexes on timestamps
Date: 2004-07-07 18:19:00
Message-ID: 1089224340.40ec3e946cf70@webmail.vcsn.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Quoting Richard Huxton <dev(at)archonet(dot)com>:

> Keith C. Perry wrote:
> >
> > I have a table with with 1 million records in it. Here is the definition
> >
> > CREATE TABLE report
> > (
> > match int4,
> > action varchar(16),
> > stamp timestamptz,
> > account varchar(32),
> > ipaddress inet,
> > profile varchar(16),
> > rating text,
> > url text
> > )
> > WITHOUT OIDS;
> >
> > The is one index:
> >
> > CREATE INDEX stamp_idx
> > ON report
> > USING btree
> > (stamp);
> >
> > That query I'm running is:
> >
> > SELECT date_part('hour'::text, report.stamp) AS "hour", count(*) AS count
> > FROM report
> > GROUP BY date_part('hour'::text, report.stamp)
> > ORDER BY date_part('hour'::text, report.stamp);
>
> You will always get a sequential scan with this query - there is no
> other way to count the rows.
>
> With PostgreSQL being MVCC based, you can't know whether a row is
> visible to you without checking it - visiting the index won't help. Even
> if it could, you'd still have to visit every row in the index.
>
> Assuming the table is a log, with always increasing timestamps, I'd
> create a summary table and query that.

Yea, actually it a proxy server log each month the databasae is 500k records. I
have two months loaded only to put some stress on the server. Some ever month
I'm loading the data just so I can do some analysis. The optimization question
came up when one of the other database folks wanted to play with the database in
MS-SQL server.

How can I add a column that respresents a function that returns just the
date_part? I wondering if that will increase the speed of the query in similar
fashion as the MS-SQL did.

I hadn't though about the MVCC vs. file locking issue. The MS-SQL server does
not have any load on it and I'm sure if other users were hitting it the same
table with the same query, PG would be perform better.

--
Keith C. Perry, MS E.E.
Director of Networks & Applications
VCSN, Inc.
http://vcsn.com

____________________________________
This email account is being host by:
VCSN, Inc : http://vcsn.com

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Doseok Kim 2004-07-07 18:37:37 (no subject)
Previous Message Anton Nikiforov 2004-07-07 18:12:49 Re: Getting user who fired a trigger