Re: optimizing selects on time-series data in Pg

From: Aditya <aditya(at)grot(dot)org>
To: sfpug(at)postgresql(dot)org
Subject: Re: optimizing selects on time-series data in Pg
Date: 2003-08-01 18:25:07
Message-ID: 20030801182507.GA29105@mighty.grot.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: sfpug

On Fri, Aug 01, 2003 at 11:07:41AM -0700, Stephan Szabo wrote:
> > - forced the planner not to use sequential scan and verified that the index
> > scan was in fact more costly then, here's a somewhat edited transcript (only
> > for brevity):
>
> Hmm, does a two column index on timestamp,virtualhost (or possibly the
> other way around) help at all when forcing an index scan? I wouldn't
> guess that it'd help enough, but it might be worth trying.

hm, didn't make a difference (I deleted a bunch of rows this morning to see
what I could get away with, hence the difference in the total row count):

zp1139=> create index zp_log_vhost_timestamp on zp_log (virtualhost, timestamp);
CREATE
zp1139=> select count(*) from zp_log ; count
--------
824276
(1 row)

zp1139=> explain analyze select count(*) from zp_log where virtualhost =
'aditya.bloki.org';
NOTICE: QUERY PLAN:

Aggregate (cost=3968.29..3968.29 rows=1 width=0) (actual time=1974.52..1974.52
rows=1 loops=1)
-> Index Scan using zp_log_vhost on zp_log (cost=0.00..3965.72 rows=1029
width=0) (actual time=54.09..1959.33 rows=5174 loops=1)
Total runtime: 1974.74 msec

EXPLAIN
zp1139=> explain analyze select count(*) from zp_log where virtualhost =
'aditya.bloki.org' and timestamp > (now() - interval '2 days');
NOTICE: QUERY PLAN:

Aggregate (cost=3974.30..3974.30 rows=1 width=0) (actual time=171.99..171.99
rows=1 loops=1)
-> Index Scan using zp_log_vhost on zp_log (cost=0.00..3973.44 rows=343
width=0) (actual time=0.81..168.19 rows=1564 loops=1)
Total runtime: 172.28 msec

EXPLAIN
zp1139=> explain analyze select count(*) from zp_log where virtualhost =
'aditya.bloki.org'; NOTICE: QUERY PLAN:

Aggregate (cost=3968.29..3968.29 rows=1 width=0) (actual time=156.69..156.69
rows=1 loops=1)
-> Index Scan using zp_log_vhost on zp_log (cost=0.00..3965.72 rows=1029
width=0) (actual time=0.59..144.81 rows=5174 loops=1)
Total runtime: 156.91 msec

EXPLAIN
zp1139=> explain analyze select count(*) from zp_log where virtualhost =
'aditya.bloki.org';
NOTICE: QUERY PLAN:

Aggregate (cost=3968.29..3968.29 rows=1 width=0) (actual time=156.58..156.58
rows=1 loops=1)
-> Index Scan using zp_log_vhost on zp_log (cost=0.00..3965.72 rows=1029
width=0) (actual time=0.61..144.83 rows=5174 loops=1)
Total runtime: 156.79 msec

EXPLAIN
zp1139=> explain analyze select count(*) from zp_log where virtualhost =
'aditya.bloki.org' and timestamp > (now() - interval '2 days');
NOTICE: QUERY PLAN:

Aggregate (cost=3974.30..3974.30 rows=1 width=0) (actual time=159.71..159.71
rows=1 loops=1)
-> Index Scan using zp_log_vhost on zp_log (cost=0.00..3973.44 rows=343
width=0) (actual time=0.61..155.88 rows=1564 loops=1)
Total runtime: 159.92 msec

EXPLAIN
zp1139=> explain analyze select count(*) from zp_log where virtualhost =
'www.bloki.com'; NOTICE: QUERY PLAN:

Aggregate (cost=53369.87..53369.87 rows=1 width=0) (actual
time=37057.75..37057.75 rows=1 loops=1)
-> Seq Scan on zp_log (cost=0.00..52778.45 rows=236567 width=0) (actual
time=20.10..36802.64 rows=43671 loops=1)
Total runtime: 37058.22 msec

EXPLAIN
zp1139=> explain analyze select count(*) from zp_log where virtualhost =
'www.bloki.com' and timestamp > (now() - interval '2 days');
NOTICE: QUERY PLAN:

Aggregate (cost=59157.66..59157.66 rows=1 width=0) (actual
time=37250.10..37250.10 rows=1 loops=1)
-> Seq Scan on zp_log (cost=0.00..58960.52 rows=78856 width=0) (actual
time=38.00..37118.34 rows=23388 loops=1)
Total runtime: 37250.25 msec

EXPLAIN
zp1139=> explain analyze select count(*) from zp_log where virtualhost =
'www.bloki.com' and timestamp > (now() - interval '2 days');
NOTICE: QUERY PLAN:

Aggregate (cost=59157.66..59157.66 rows=1 width=0) (actual
time=41464.19..41464.19 rows=1 loops=1)
-> Seq Scan on zp_log (cost=0.00..58960.52 rows=78856 width=0) (actual
time=102.08..41332.15 rows=23386 loops=1)
Total runtime: 41464.35 msec

EXPLAIN

In response to

Responses

Browse sfpug by date

  From Date Subject
Next Message Stephan Szabo 2003-08-01 18:28:05 Re: optimizing selects on time-series data in Pg
Previous Message Stephan Szabo 2003-08-01 18:07:41 Re: optimizing selects on time-series data in Pg