optimizing selects on times-eries data in Pg

From: Aditya <aditya(at)grot(dot)org>
To: sfpug(at)postgresql(dot)org
Subject: optimizing selects on times-eries data in Pg
Date: 2003-08-01 00:52:32
Message-ID: 20030801005232.GA21352@mighty.grot.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: sfpug

does anyone have pointers to optimizing selects on a table with logfile data?
it's really a time-series -- webserver access logs. I keep data for about a
week and there are around a million rows and selects are done on virtualhost
basis (and there is an index on virtualhost) however as the number of rows
increases, a sequential scan is less costly than an index scan (at least
according to the output of explain analyze)...

I've searched/googled and I don't see any "hints" on how to make this work
much better -- I might have to be content on storing 2 days worth of data at a
time to make the selects work in a reasonable amount of time...

Hints, pointers welcome,
Thanks,
Adi

The table looks like and rows are inserted in real-time (I delete all rows
older than 7 days nightly followed by a full vaccuum):

zp1139=> \d zp_log
Table "zp_log"
Column | Type | Modifiers
-------------+--------------------------+-----------
priority | character varying(15) |
category | character varying(30) |
thread | character varying(30) |
message | character varying(255) |
timestamp | timestamp with time zone |
remoteip | text | not null
remotehost | text |
referer | text |
status | integer | not null
port | integer | not null
virtualhost | text | not null
query | text |
bytes | integer | not null
useragent | text |
Indexes: zp_log_timestamp,
zp_log_vhost

zp1139=> \d zp_log_timestamp
Index "zp_log_timestamp"
Column | Type
-----------+--------------------------
timestamp | timestamp with time zone
btree

zp1139=> \d zp_log_vhost
Index "zp_log_vhost"
Column | Type
-------------+------
virtualhost | text
btree

Responses

Browse sfpug by date

  From Date Subject
Next Message Josh Berkus 2003-08-01 16:58:24 Re: optimizing selects on times-eries data in Pg
Previous Message LN Cisneros 2003-07-30 19:36:32 Re: Next Meeting -- Buca di Beppo Wed. Aug. 6