Putting many related fields as an array

From: Ow Mun Heng <Ow(dot)Mun(dot)Heng(at)wdc(dot)com>
To: pgsql-general(at)postgresql(dot)org, WDC - Ow Mun Heng <ow(dot)mun(dot)heng(at)wdc(dot)com>
Subject: Putting many related fields as an array
Date: 2009-05-12 05:23:14
Message-ID: 1242105794.5457.59.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi,

Currently doing some level of aggregrate tables for some data. These
data will be used for slice/dice activity and we want to be able to
play/manipulate the data such that I can get means and stddev data.

Eg: For each Original Column eg:

population_in_town : (I get derivatives)
- mean # of ppl in each town
- stddev # of ppl in each town (stdev calc already uses 2 extra columns
for # of ppl squared and qty of ppl)
- count of ppl
- count of # of ppl is < 100 (to get a percentage of population)
- count of # of ppl is < 500

Hence, I'm seeing a 1:5 column growth here if I put them as column
based.

eg:
| sum of count | sum_of_count_squared | qty | qty < 100 | qty < 500 |

I'm thinking of lumping them into 1 column via an array instead of into
5 different columns. Not sure how to go about this, hence the email to
the list.

something like {244,455,1234,43,23}

query can be done like

sum_of_count / qty = Ave
(sum_of_count_squared * sum_qty ) / (qty * (qty-1)) = STDEV
(sum_qty<100 / sum_qty) = % < 100
(sum_qty<500 / sum_qty) = % < 500

Then there's the issue of speed/responsiveness on doing it.

Help would be appreciated in this.

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Toomas Vendelin 2009-05-12 05:48:38 Re: Pgsql errors, DBI and CGI::Carp
Previous Message Francisco Figueiredo Jr. 2009-05-12 05:09:08 Difference between "in (...)" and "= any(...)" queries when using arrays