Re: [HACKERS] Aggregation question

From: Bruno Wolff III <bruno(at)wolff(dot)to>
To: Dilip Angal <dilip_angal(at)yahoo(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, pgsql-general(at)postgresql(dot)org
Subject: Re: [HACKERS] Aggregation question
Date: 2004-02-07 18:09:49
Message-ID: 20040207180949.GA26459@wolff.to
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

On Tue, Feb 03, 2004 at 21:27:48 -0800,
Dilip Angal <dilip_angal(at)yahoo(dot)com> wrote:
> Hi
>
> I have a situation that I need flexible number columns to model the business requirements. It could go up to 1000 columns and will be a sparse matrix. One option I was considering was to have a table with col1......col1000

I don't think this is really a hackers' topic. Probably it belongs on general,
though with the sizes you are talking about performance might be appropiate.

I would think that the natural way to reprensent a sparse matrix would be
to use a three column table with row number, column number, and cell value.
Doing aggregates on a column number will be easy.

If the data isn't really a matrix and the different cells are different
data types, you may be able to store all the values as strings and convert
to the appropiate type as needed.

> Other option I can consider is store all of them as name values in single column as a string
>
> option 1
> col1 | col2 |col 3 |........|col56|.....|col77|
> 10 | 2 | | | 4 | | |
> 2 | | 4 | | |.... |6 |
>
> option 2
>
> "col1=10,col2=2,col56=4"
> "col1=2,col3=4,col77=6"
>
> I will have about 50Million such records in a table.
>
> I need to aggregate the column values for a given day by taking sum of all the rows
>
> "col1=12,col2=2,col3=4,col56=4,col77=6"
>
> Second option looks very elegant but aggregation is hard
> Besides, second option may have performance penalty (Especially when you have 50 Million records)
>
> Any one can give any pointers or comments on how to model this an how to aggregate it?
>
> Thanks
> Dilip
>

In response to

Browse pgsql-general by date

  From Date Subject
Next Message David Garamond 2004-02-07 18:40:22 connectby for BYTEA keys
Previous Message Hans-Jürgen Schönig 2004-02-07 17:27:41 Re: Aggregation question

Browse pgsql-hackers by date

  From Date Subject
Next Message Alex J. Avriette 2004-02-07 18:12:17 RFC: Security documentation
Previous Message Hans-Jürgen Schönig 2004-02-07 17:27:41 Re: Aggregation question