Skip site navigation (1) Skip section navigation (2)

Re: Table Clustering & Time Range Queries

From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Kevin Buckham <kbuckham(at)applocation(dot)net>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Table Clustering & Time Range Queries
Date: 2009-10-25 00:16:46
Message-ID: alpine.GSO.2.01.0910242007120.27172@westnet.com (view raw or flat)
Thread:
Lists: pgsql-performance
I'm surprised clustering as your main optimization has scaled up for you 
as long as it has, I normally see that approach fall apart once you're 
past a few hundred GB of data.  You're putting a lot of work into a 
technique that only is useful for smaller data sets than you have now. 
There are two basic approaches to optimizing queries against large 
archives of time-series data that do scale up when you can use them:

1) Partition the tables downward until you reach a time scale where the 
working set fits in RAM.

2) Create materialized views that roll up the data needed for the most 
common reports people need run in real-time.  Optimize when those run to 
keep overhead reasonable (which sounds possible given your comments about 
regular maintenance windows).  Switch the app over to running against the 
materialized versions of any data it's possible to do so on.  The two 
standard intros to this topic are at 
http://tech.jonathangardner.net/wiki/PostgreSQL/Materialized_Views and 
http://www.pgcon.org/2008/schedule/events/69.en.html

From what you've said about your app, I'd expect both of these would be 
worth considering.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD

In response to

pgsql-performance by date

Next:From: decibelDate: 2009-10-25 20:43:56
Subject: Re: Domain vs table
Previous:From: Jesper KroghDate: 2009-10-24 04:06:11
Subject: Re: Full text search - query plan? PG 8.4.1

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group