Quick Links

Re: Timestamp-based indexing

From:	"Matthew T(dot) O'Connor" <matthew(at)zeut(dot)net>
To:	"Harmon S(dot) Nine" <hnine(at)netarx(dot)com>
Cc:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: Timestamp-based indexing
Date:	2004-07-26 15:09:53
Message-ID:	41051EC1.3070704@zeut.net
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

VACUUM FULL ANALYZE every 3 hours seems a little severe. You will
probably be be served just as well by VACUUM ANALYZE. But you probably
don't need the VACUUM part most of the time. You might try doing an
ANALYZE on the specific tables you are having issues with. Since
ANALYZE should be much quicker and not have the performance impact of a
VACUUM, you could do it every hour, or even every 15 minutes.

Good luck...

Harmon S. Nine wrote:

> Hello --
>
> To increase query (i.e. select) performance, we're trying to get
> postgres to use an index based on a timestamp column in a given table.
>
> Event-based data is put into this table several times a minute, with
> the timestamp indicating when a particular row was placed in the table.
>
> The table is purged daily, retaining only the rows that are less than
> 7 days old. That is, any row within the table is less than 1 week old
> (+ 1 day, since the purge is daily).
>
> A typical number of rows in the table is around 400,000.
>
> A "VACUUM FULL ANALYZE" is performed every 3 hours.
>
>
> The problem:
> We often query the table to extract those rows that are, say, 10
> minutes old or less.
>
> Given there are 10080 minutes per week, the planner could, properly
> configured, estimate the number of rows returned by such a query to be:
>
> 10 min/ 10080 min * 400,000 = 0.001 * 400,000 = 400.
>
> Making an index scan, with the timestamp field the index, far faster
> then a sequential scan.
>
>
> However, we can't get the planner to do an timestamp-based index scan.
>
> Anyone know what to do?
>
>
> Here's the table specs:
>
> monitor=# \d "eventtable"
> Table "public.eventtable"
> Column | Type |
> Modifiers
> -----------+-----------------------------+--------------------------------------------------------------
>
> timestamp | timestamp without time zone | not null default
> ('now'::text)::timestamp(6) with time zone
> key | bigint | not null default
> nextval('public."eventtable_key_seq"'::text)
> propagate | boolean |
> facility | character(10) |
> priority | character(10) |
> host | character varying(128) | not null
> message | text | not null
> Indexes:
> "eventtable_pkey" primary key, btree ("timestamp", "key")
> "eventtable_host" btree (host)
> "eventtable_timestamp" btree ("timestamp")
>
>
> Here's a query (with "explain analyze"):
>
> monitor=# explain analyze select * from "eventtable" where timestamp >
> CURRENT_TIMESTAMP - INTERVAL '10 minutes';
> QUERY PLAN
> ----------------------------------------------------------------------------------------------------------------------------
>
> Seq Scan on "eventtable" (cost=0.00..19009.97 rows=136444 width=155)
> (actual time=11071.073..11432.522 rows=821 loops=1)
> Filter: (("timestamp")::timestamp with time zone >
> (('now'::text)::timestamp(6) with time zone - '@ 10 mins'::interval))
> Total runtime: 11433.384 ms
> (3 rows)
>
>
> Here's something strange. We try to disable sequential scans, but to
> no avail. The estimated cost skyrockets, though:
>
> monitor=# set enable_seqscan = false;
> SET
> monitor=# explain analyze select * from "eventtable" where timestamp >
> CURRENT_TIMESTAMP - INTERVAL '10 minutes';
> QUERY PLAN
> -------------------------------------------------------------------------------------------------------------------------------------
>
> Seq Scan on "eventtable" (cost=100000000.00..100019009.97 rows=136444
> width=155) (actual time=9909.847..9932.438 rows=1763 loops=1)
> Filter: (("timestamp")::timestamp with time zone >
> (('now'::text)::timestamp(6) with time zone - '@ 10 mins'::interval))
> Total runtime: 9934.353 ms
> (3 rows)
>
> monitor=# set enable_seqscan = true;
> SET
> monitor=#
>
>
>
> Any help is greatly appreciated :)
>
> -- Harmon
>
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo(at)postgresql(dot)org so that your
> message can get through to the mailing list cleanly
>

In response to

Timestamp-based indexing at 2004-07-26 14:49:26 from Harmon S. Nine

Responses

Re: Timestamp-based indexing at 2004-07-26 15:46:42 from Harmon S. Nine
Re: Timestamp-based indexing at 2004-07-26 15:59:11 from Tom Lane

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Harmon S. Nine	2004-07-26 15:42:26	Re: Timestamp-based indexing
Previous Message	Kevin Barnard	2004-07-26 14:58:38	Re: Timestamp-based indexing