
Timestamp-based indexing

From: "Harmon S(dot) Nine" <hnine(at)netarx(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Timestamp-based indexing
Date: 2004-07-26 14:49:26
Message-ID: 410519F6.5020806@netarx.com
Lists: pgsql-performance
Hello --

To increase query (i.e. select) performance, we're trying to get 
postgres to use an index based on a timestamp column in a given table.

Event-based data is put into this table several times a minute, with the 
timestamp indicating when a particular row was placed in the table.

The table is purged daily, retaining only the rows that are less than 7
days old. That is, every row in the table is less than 1 week old (plus
up to 1 day, since the purge runs only once a day).

A typical number of rows in the table is around 400,000.

A "VACUUM FULL ANALYZE" is performed every 3 hours.
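
For reference, the purge is of roughly this shape (the exact statement
is a sketch, not the production code):

```sql
-- Hypothetical daily purge: keep only rows from the last 7 days.
DELETE FROM "eventtable"
WHERE "timestamp" < CURRENT_TIMESTAMP - INTERVAL '7 days';

-- Reclaim space and refresh planner statistics afterwards.
VACUUM FULL ANALYZE "eventtable";
```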


The problem:
We often query the table to extract those rows that are, say, 10 minutes 
old or less.

Given there are 10080 minutes per week, a properly configured planner
should estimate the number of rows returned by such a query to be about:

10 min / 10080 min  *  400,000  ≈  0.001 * 400,000  =  400.

At that selectivity, an index scan on the timestamp column should be far
faster than a sequential scan.
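
The arithmetic can be double-checked directly in psql (any session will
do; no table access is needed):

```sql
-- Expected row count for a 10-minute window over one week of data.
SELECT 10.0 / 10080 * 400000 AS expected_rows;
-- roughly 396.8, i.e. about 400 rows
```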


However, we can't get the planner to do a timestamp-based index scan.

Anyone know what to do?


Here's the table specs:

monitor=# \d "eventtable"
                                        Table "public.eventtable"
  Column   |            Type             |                           Modifiers
-----------+-----------------------------+----------------------------------------------------------------
 timestamp | timestamp without time zone | not null default ('now'::text)::timestamp(6) with time zone
 key       | bigint                      | not null default nextval('public."eventtable_key_seq"'::text)
 propagate | boolean                     |
 facility  | character(10)               |
 priority  | character(10)               |
 host      | character varying(128)      | not null
 message   | text                        | not null
Indexes:
    "eventtable_pkey" primary key, btree ("timestamp", "key")
    "eventtable_host" btree (host)
    "eventtable_timestamp" btree ("timestamp")


Here's a query (with "explain analyze"):

monitor=# explain analyze select * from "eventtable" where timestamp > CURRENT_TIMESTAMP - INTERVAL '10 minutes';
                                                         QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
 Seq Scan on "eventtable"  (cost=0.00..19009.97 rows=136444 width=155) (actual time=11071.073..11432.522 rows=821 loops=1)
   Filter: (("timestamp")::timestamp with time zone > (('now'::text)::timestamp(6) with time zone - '@ 10 mins'::interval))
 Total runtime: 11433.384 ms
(3 rows)
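
One detail worth noting in the plan above (offered as a possible avenue,
not a confirmed fix): the Filter line shows the column being cast to
timestamp *with* time zone before the comparison, since
CURRENT_TIMESTAMP returns that type. A cross-type comparison like this
can prevent the planner from using the btree index on the plain
timestamp column. A variant query that keeps both sides in the column's
own type would be:

```sql
-- Sketch: compare in timestamp-without-time-zone so the
-- "eventtable_timestamp" index is applicable to the condition.
EXPLAIN ANALYZE
SELECT * FROM "eventtable"
WHERE "timestamp" >
      (CURRENT_TIMESTAMP - INTERVAL '10 minutes')::timestamp without time zone;
```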


Here's something strange.  We tried disabling sequential scans, but to
no avail: the plan remains a sequential scan, though its estimated cost
skyrockets:

monitor=# set enable_seqscan = false;
SET
monitor=# explain analyze select * from "eventtable" where timestamp > CURRENT_TIMESTAMP - INTERVAL '10 minutes';
                                                             QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on "eventtable"  (cost=100000000.00..100019009.97 rows=136444 width=155) (actual time=9909.847..9932.438 rows=1763 loops=1)
   Filter: (("timestamp")::timestamp with time zone > (('now'::text)::timestamp(6) with time zone - '@ 10 mins'::interval))
 Total runtime: 9934.353 ms
(3 rows)

monitor=# set enable_seqscan = true;
SET
monitor=#



Any help is greatly appreciated :)

-- Harmon



