From: | <richard(dot)henwood(at)stfc(dot)ac(dot)uk> |
---|---|
To: | <anthony(at)resolution(dot)com> |
Cc: | <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: Speed / Server |
Date: | 2009-10-07 08:40:39 |
Message-ID: | EB1E7CB92F5B35459E0B926D2A614DB6106D32@EXCHANGE19.fed.cclrc.ac.uk |
Lists: | pgsql-performance |
> -----Original Message-----
<snip>
> >
> > The problem is, this next year we're anticipating significant growth,
> > where we may be adding more like 20 million rows per month (roughly
> 15GB
> > of data).
> >
> > A row of data might have:
> > The system identifier (int)
> > Date/Time read (timestamp)
> > Sensor identifier (int)
> > Data Type (int)
> > Data Value (double)
>
> One approach that can sometimes help is to use arrays to pack data.
> Arrays may or may not work for the data you are collecting: they work
> best when you always pull the entire array for analysis and not a
> particular element of the array. Arrays work well because they pack
> more data into index fetches and you get to skip the 20 byte tuple
> header. That said, they are an 'optimization trade off'...you are
> making one type of query fast at the expense of others.
>
I recently used arrays for a 'long and thin' table very like those
described here. The tuple header became increasingly significant in our
case. There are some details in my post:
http://www.nabble.com/optimizing-for-temporal-data-behind-a-view-td25490818.html
As Merlin points out, one considerable side-effect of using arrays
is that it restricts the sorts of queries we can perform -
i.e. querying for an individual element inside an array becomes costly.
So we needed to make sure our user scenarios (requirements)
were well understood first.
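To illustrate the trade-off (a hypothetical sketch - table and column
names are invented, not taken from our actual schema): packing one
day's readings per sensor into an array replaces many narrow rows,
each carrying its own tuple-header overhead, with a single wide row.

```sql
-- Long-and-thin layout: one row (and one tuple header) per reading.
CREATE TABLE readings (
    system_id   integer,
    read_at     timestamp,
    sensor_id   integer,
    data_type   integer,
    data_value  double precision
);

-- Array-packed layout: one row per sensor per day. Far less header
-- overhead, but individual readings can no longer be indexed or
-- filtered cheaply - you generally fetch and unpack the whole array.
CREATE TABLE readings_packed (
    system_id   integer,
    sensor_id   integer,
    data_type   integer,
    read_date   date,
    data_values double precision[]  -- e.g. one element per sample
);
```

Queries that always analyse a whole day of samples benefit; queries
that filter on a single timestamp or value do not.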
richard