Re: advice on how to store variable attributes

From: David Johnston <polobo(at)yahoo(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Linos <info(at)linos(dot)es>, "pgsql-sql(at)postgresql(dot)org" <pgsql-sql(at)postgresql(dot)org>
Subject: Re: advice on how to store variable attributes
Date: 2011-10-22 14:28:48
Message-ID: DABCF799-0E42-49AB-9650-66838B764C34@yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

On Oct 22, 2011, at 10:07, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
> 2011/10/22 David Johnston <polobo(at)yahoo(dot)com>:
>> On Oct 22, 2011, at 6:41, Linos <info(at)linos(dot)es> wrote:
>>
>>> Hi all,
>>> i need a little of advice on what could be the best way to store this information.
>>>
>>> We need to calculate the difference in costs for our operations, we are already
>>> storing our vendor invoices in the database so calculate the monetary change it
>>> is a no-brainer but we need to store special attributes for any of the invoices
>>> that we need to compare too, for example:
>>> -electric provider: total Kw.
>>> -water provider: total m3.
>>> -car maintenance: kilometers of the car.
>>> -mobile phones provider: international call minutes, national minutes, number
>>> of sms, etc..
>>>
>>> And much more types/variables, the number of variables can change, not every day
>>> but still can change, i would like that they can be defined/changed from our
>>> application, so alter table to add columns don't seem the best way (still an
>>> option though). We will have "generic" reports that will show us changes in
>>> costs and specific reports for the types with "extended attributes" that we want
>>> to compare.
>>>
>>> To compare values from this "extended attributes" i think we have two ways:
>>> 1- have them in columns and use standard SQL.
>>> 2- create the columns with a function that reads this attrs and create the columns.
>>>
>>> So far i thin we have this options:
>>> 1- a bunch of columns that would be null except when the type of the invoice
>>> uses them.
>>> 2- a table related with the vendor invoices table for every type of invoice
>>> with his specifics columns.
>>> 3- a key/value in a separate table related with the vendor invoices table where
>>> i store the extended attrs of every invoice that needs them.
>>> 4- use a hstore column in the vendor invoces table to store this attrs.
>>>
>>> The first two have the problem of probably changes to the number of attributes
>>> of every type and give a more closed solution, apart from that 1- seems to be a
>>> bit awkward and 2- would need the application that creates the query to know
>>> with what table should join for every type (other point we will need to change
>>> if we want to create new invoices types).
>>>
>>> The last two have his own problems too, with 3 i will need to create a function
>>> that return rows as columns to compare them, with 4- given that i will store the
>>> attrs of every type in the database anyway i can use the operator -> (with a
>>> CASE using operator ? returning 0 if the searched attr it is not in the hstore)
>>> but still don't seem a clean solution for me.
>>>
>>> For me it seems i am missing something, probably any of you have a much more
>>> elegant (or correct) way to handle this situation, what would be your advice?
>>> Thanks.
>>>
>>>
>>
>> Create a table with a single numeric column and multiple category columns.
>>
>> ( amount_value, amount_unit, amount_category, vendor_id )
>
>
> This is EAV model - is good for smaller datasets, for larger datasets
> is problematic. There is second possibility - using a "hstore" contrib
> module - that emulates HASH table - It has better for larger datasets.
>
> Regards
>
> Pavel Stehule

Store was mentioned by the OP.

Any suggestions on where the line between small and large is drawn?

Partitions could help in the larger cases.

My personal first choice is to use separate tables. If going EAV route plan on eventually moving to the separate table route and at least try to make migration relatively easy.

Since both models capture the same data the decision at least partially rests upon the immediate use-cases for the data. Lacking use-cases and size parameters recommendations are difficult to make.

I have not used hstore so my ability to recommend it is limited. My main "concern" is the fact that the data is less visible and always stored as text. I see hstore as being useful for situations where the attributes are variable but this use case seems to have fairly well-defined attributes.

David J.

>
>>
>> If necessary each "amount_value" data type should have it's own table since the processing logic will vary (I.e., you cannot subtract text or Boolean values).
>>
>> You are , in effect, creating multiple tables but combining them into one and using the category column to distinguish between them.
>>
>> David J.
>> --
>> Sent via pgsql-sql mailing list (pgsql-sql(at)postgresql(dot)org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-sql
>>
>
> --
> Sent via pgsql-sql mailing list (pgsql-sql(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-sql

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Pavel Stehule 2011-10-22 14:33:17 Re: advice on how to store variable attributes
Previous Message Pavel Stehule 2011-10-22 14:07:07 Re: advice on how to store variable attributes