Dynamic table

From: A B <gentosaker(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Dynamic table
Date: 2009-06-16 07:11:20
Message-ID: dbbf25900906160011h32e583fcvbc1f499328de3ed3@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi.
I have a little problem (and a suggestion for a solution) that I
wondered if anyone would care to comment on.

I have a standard table filled with customers (with a unique customer
id, names etc.) and for each customer I need to store some integer
values. The problem is that the number of integer values that should
be stored WILL change over time (it can both increase and decrease).
It will most likely grow from zero to 10-18 and then perhaps add 1 or
remove one value per year but basically be about 10-18 or so.

I must be able to add a new integer value to all customers, and remove
an integer value from all customers Altering the table by adding and
deleting columns would theoretically solve it, but since columns are
not really dropped, just hidden away, and there is a 1600 column
limit on tables as I understand it, this would crash the application
at some time in the future, or at least it will waste a lot of
discspace.

Other things I must be able to do is to alter values for a specific
integer value for one customer or all customers simultaneous. This can
be like "change all value 4 to 8 for the third integer value".
And I must be able to quickly select all the integers for a specific
customer for further processing. There will also be access to single
integer values for a specific customer. It's hard to say how much
compared to "get all integer values" for a specific customer. All
customers will be equally accessed.

As I see it I have these options.

Method A) a "add-only" dynamic table that re-uses columns that are not
used any more.

First create a metatable
CREATE TABLE Columnspec (id integer, columnname varchar(10), name
varchar(30), defaultvalue integer, active boolean default true );
That will keep track of the dynamically added columns.

and a table for all the integer values
CREATE TABLE customerValues (customer_id integer references customer);
Initially it will be empty

and adding a new integer value called "shoe size" with a default value
of 4 would mean to run these two queries.
insert into Columnspec (id,columnname,name,defaultvalue) values (1,
'col1', 'shoe size', 4);
alter table customerValues add column col1 integer default 4;

removing a column would mean an operation like
update Columnspec set active=false where id=5;

then adding a new column would mean to reuse an inactive column, like
the following operations
update Columnspec set active=true, name='hair length', defaultvalue=10
where id=5;
update customerValues set col5=10; -- to give all customers the default value.

One could perhaps alter the default value of the column called 'col5'
in the example above and also alter the names of the columns but that
are just cosmetic.
The only drawback I can see with this rightnow is that if the number
of columns decreases, it will waste space.

Method B)
The EAV way.
With a meta table
CREATE TABLE metadata (id integer primary key, name varchar(30),
defaultvalue integer);

and then the values
CREATE TABLE eav (customer_id references customer, key integer
references metadata on delete cascade , value integer,
unique(customer_id, key) );

With c customers, k integer fields method A requires at least c*(k+1)
integers to store, B requires c*k*3. A factor about 3 sounds not that
much wasted space, but of course it can make a difference.

Method C)
Have a metadata table as usual, and then store the values in an array.
I must admit that I have not looked enough at arrays yet.
The drawback I can see right now will be the access to a specific
value in the array, but I guess PostgreSQL is fast...

Method D)
As method B) but add a view that aggregates the values into a complete
set for each customers if that will speed up anything....

Any comments on the pros/cons with these strategies?

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Albe Laurenz 2009-06-16 07:20:19 Re: How can I interpolate psql variables in function bodies?
Previous Message Eduardo Morras 2009-06-16 07:05:35 Re: Postgres 8.2 Memory Tuning