From: | Ow Mun Heng <Ow(dot)Mun(dot)Heng(at)wdc(dot)com> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | Q:Aggregrating Weekly Production Data. How do you do it? |
Date: | 2007-09-18 04:34:25 |
Message-ID: | 1190090065.3793.58.camel@neuromancer.home.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Just wondering how everyone is doing aggregration of production data.
Data which runs in the vicinity of a few million a week.
What are the methods which will effectively provide the
min/max/average/count/stdev of the weekly sample size based on different
products/software mix etc.
and still be able to answer correctly, what's the average of data_1 over
the pass 2 months?
I can't just take the average of an 8 averages of each week)
eg:
wk avg data_1
w1 - 2
w2 - 2
w3 - 2
w4 - 3
w5 - 1
w6 - 2
w7 - 2
w8 - 2
average of past 2 months = ave(w1-w8) which is statistically wrong.
using sum of data_1 per week would work though. Please share your
expertise / experience.
(getting min/max/count isn't much of an issue. Stdev is the main issue I
believe)
One such instance I've read about is..
>From this website : (it references using SQL Server Analysis services
but I think the concept is the same)
http://www.phptr.com/articles/printerfriendly.asp?p=337135&rl=1
1. Calculate sum of square of each sale
2. multiple the result of step 1 by the sales count
3. sum all sales
4. Square the result of step 3
5. Substract the result of step 4 from the result of step 2
6. Multiply the sales count by one less than sales count ("sales_count"
* ("sales_count" - 1))
7. Divide the result of step 5 by the result of step 6
8. Stdev will be the square root of step 7
The results are valid (verified with actual data) but I don't understand
the logic. All the Statistical books I've read marked stdev as sqrt
(sum(x - ave(x))^2 / (n - 1). The formula is very different, hence the
confusion.
From | Date | Subject | |
---|---|---|---|
Next Message | Phoenix Kiula | 2007-09-18 05:00:31 | For index bloat: VACUUM ANALYZE vs REINDEX/CLUSTER |
Previous Message | Ow Mun Heng | 2007-09-18 04:23:04 | Re: New/Custom DataType - Altering definition / seeing definition in pgAdmin3 |