Re: advice on how to store variable attributes

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: David Johnston <polobo(at)yahoo(dot)com>
Cc: Linos <info(at)linos(dot)es>, "pgsql-sql(at)postgresql(dot)org" <pgsql-sql(at)postgresql(dot)org>
Subject: Re: advice on how to store variable attributes
Date: 2011-10-22 14:33:17
Message-ID: CAFj8pRBJDpNqsb1PphQT_RCSZ3SGcBFaPP4UtdCZqQ10X7m7vw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

2011/10/22 David Johnston <polobo(at)yahoo(dot)com>:
> On Oct 22, 2011, at 10:07, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>> 2011/10/22 David Johnston <polobo(at)yahoo(dot)com>:
>>> On Oct 22, 2011, at 6:41, Linos <info(at)linos(dot)es> wrote:
>>>
>>>> Hi all,
>>>>    i need a little of advice on what could be the best way to store this information.
>>>>
>>>> We need to calculate the difference in costs for our operations, we are already
>>>> storing our vendor invoices in the database so calculate the monetary change it
>>>> is a no-brainer but we need to store special attributes for any of the invoices
>>>> that we need to compare too, for example:
>>>>    -electric provider: total Kw.
>>>>    -water provider: total m3.
>>>>    -car maintenance: kilometers of the car.
>>>>    -mobile phones provider: international call minutes, national minutes, number
>>>> of sms, etc..
>>>>
>>>> And much more types/variables, the number of variables can change, not every day
>>>> but still can change, i would like that they can be defined/changed from our
>>>> application, so alter table to add columns don't seem the best way (still an
>>>> option though). We will have "generic" reports that will show us changes in
>>>> costs and specific reports for the types with "extended attributes" that we want
>>>> to compare.
>>>>
>>>> To compare values from this "extended attributes" i think we have two ways:
>>>>    1- have them in columns and use standard SQL.
>>>>    2- create the columns with a function that reads this attrs and create the columns.
>>>>
>>>> So far i thin we have this options:
>>>>    1- a bunch of columns that would be null except when the type of the invoice
>>>> uses them.
>>>>    2- a table related with the vendor invoices table for every type of invoice
>>>> with his specifics columns.
>>>>    3- a key/value in a separate table related with the vendor invoices table where
>>>> i store the extended attrs of every invoice that needs them.
>>>>    4- use a hstore column in the vendor invoces table to store this attrs.
>>>>
>>>> The first two have the problem of probably changes to the number of attributes
>>>> of every type and give a more closed solution, apart from that 1- seems to be a
>>>> bit awkward and 2- would need the application that creates the query to know
>>>> with what table should join for every type (other point we will need to change
>>>> if we want to create new invoices types).
>>>>
>>>> The last two have his own problems too, with 3 i will need to create a function
>>>> that return rows as columns to compare them, with 4- given that i will store the
>>>> attrs of every type in the database anyway i can use the operator -> (with a
>>>> CASE using operator ? returning 0 if the searched attr it is not in the hstore)
>>>> but still don't seem a clean solution for me.
>>>>
>>>> For me it seems i am missing something, probably any of you have a much more
>>>> elegant (or correct) way to handle this situation, what would be your advice?
>>>> Thanks.
>>>>
>>>>
>>>
>>> Create a table with a single numeric column and multiple category columns.
>>>
>>> ( amount_value, amount_unit, amount_category, vendor_id )
>>
>>
>> This is EAV model - is good for smaller datasets, for larger datasets
>> is problematic. There is second possibility - using a "hstore" contrib
>> module - that emulates HASH table - It has better for larger datasets.
>>
>> Regards
>>
>> Pavel Stehule
>
> Store was mentioned by the OP.
>
> Any suggestions on where the line between small and large is drawn?
>
> Partitions could help in the larger cases.
>
> My personal first choice is to use separate tables.  If going EAV route plan on eventually moving to the separate table route and at least try to make migration relatively easy.
>
> Since both models capture the same data the decision at least partially rests upon the immediate use-cases for the data.  Lacking use-cases and size parameters recommendations are difficult to make.
>
> I have not used hstore so my ability to recommend it is limited.  My main "concern" is the fact that the data is less visible and always stored as text.  I see hstore as being useful for situations where the attributes are variable but this use case seems to have fairly well-defined attributes.
>

ook - it is true

Pavel

> David J.
>
>
>>
>>>
>>> If necessary each "amount_value" data type should have it's own table since the processing logic will vary (I.e., you cannot subtract text or Boolean values).
>>>
>>> You are , in effect, creating multiple tables but combining them into one and using the category column to distinguish between them.
>>>
>>> David J.
>>> --
>>> Sent via pgsql-sql mailing list (pgsql-sql(at)postgresql(dot)org)
>>> To make changes to your subscription:
>>> http://www.postgresql.org/mailpref/pgsql-sql
>>>
>>
>> --
>> Sent via pgsql-sql mailing list (pgsql-sql(at)postgresql(dot)org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-sql
>

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Linos 2011-10-22 15:39:20 Re: advice on how to store variable attributes
Previous Message David Johnston 2011-10-22 14:28:48 Re: advice on how to store variable attributes