Re: advice on how to store variable attributes

From: David Johnston <polobo(at)yahoo(dot)com>
To: Linos <info(at)linos(dot)es>
Cc: "pgsql-sql(at)postgresql(dot)org" <pgsql-sql(at)postgresql(dot)org>
Subject: Re: advice on how to store variable attributes
Date: 2011-10-22 16:10:48
Message-ID: 8B9FEA8A-D98D-4B5C-B957-2949A05DF9E6@yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

On Oct 22, 2011, at 11:39, Linos <info(at)linos(dot)es> wrote:

> El 22/10/11 14:53, David Johnston escribió:
>> On Oct 22, 2011, at 6:41, Linos <info(at)linos(dot)es> wrote:
>>
>>> Hi all,
>>> i need a little of advice on what could be the best way to store this information.
>>>
>>> We need to calculate the difference in costs for our operations, we are already
>>> storing our vendor invoices in the database so calculate the monetary change it
>>> is a no-brainer but we need to store special attributes for any of the invoices
>>> that we need to compare too, for example:
>>> -electric provider: total Kw.
>>> -water provider: total m3.
>>> -car maintenance: kilometers of the car.
>>> -mobile phones provider: international call minutes, national minutes, number
>>> of sms, etc..
>>>
>>> And much more types/variables, the number of variables can change, not every day
>>> but still can change, i would like that they can be defined/changed from our
>>> application, so alter table to add columns don't seem the best way (still an
>>> option though). We will have "generic" reports that will show us changes in
>>> costs and specific reports for the types with "extended attributes" that we want
>>> to compare.
>>>
>>> To compare values from this "extended attributes" i think we have two ways:
>>> 1- have them in columns and use standard SQL.
>>> 2- create the columns with a function that reads this attrs and create the columns.
>>>
>>> So far i thin we have this options:
>>> 1- a bunch of columns that would be null except when the type of the invoice
>>> uses them.
>>> 2- a table related with the vendor invoices table for every type of invoice
>>> with his specifics columns.
>>> 3- a key/value in a separate table related with the vendor invoices table where
>>> i store the extended attrs of every invoice that needs them.
>>> 4- use a hstore column in the vendor invoces table to store this attrs.
>>>
>>> The first two have the problem of probably changes to the number of attributes
>>> of every type and give a more closed solution, apart from that 1- seems to be a
>>> bit awkward and 2- would need the application that creates the query to know
>>> with what table should join for every type (other point we will need to change
>>> if we want to create new invoices types).
>>>
>>> The last two have his own problems too, with 3 i will need to create a function
>>> that return rows as columns to compare them, with 4- given that i will store the
>>> attrs of every type in the database anyway i can use the operator -> (with a
>>> CASE using operator ? returning 0 if the searched attr it is not in the hstore)
>>> but still don't seem a clean solution for me.
>>>
>>> For me it seems i am missing something, probably any of you have a much more
>>> elegant (or correct) way to handle this situation, what would be your advice?
>>> Thanks.
>>>
>>>
>>
>> Create a table with a single numeric column and multiple category columns.
>>
>> ( amount_value, amount_unit, amount_category, vendor_id )
>>
>> If necessary each "amount_value" data type should have it's own table since the processing logic will vary (I.e., you cannot subtract text or Boolean values).
>>
>> You are , in effect, creating multiple tables but combining them into one and using the category column to distinguish between them.
>>
>> David J.
>
> Yeah, thanks for the advice David, if i understand you. this is (much better
> explained) my option 3, i have used this format any times for configuration
> tables with great success.
>
> I am not speaking about much data, maybe 200~300 invoices every month so this
> should not be a problem in a long long time, the problem i see with this way of
> doing things it is how should i in the report queries get this values as columns
> to compare change in time? something like this?
>
> SELECT inv.invoice_id,
> inv.total,
> (SELECT amount_value
> FROM vendor_invoices_attrs
> WHERE invoice_id = inv.invoice_id
> AND amount_category = 'international call minutes'),
> (SELECT amount_value
> FROM vendor_invoices_attrs
> WHERE invoice_id = inv.invoice_id
> AND amount_category = 'national call minutes')
> FROM vendor_invoices AS inv
>
> With hstore i think that could be:
> SELECT inv.invoice_id,
> inv.total,
> inv.store -> 'international call minutes',
> inv.store -> 'national call minutes'
> FROM vendor_invoices
>
> For me the last option seems preferable but maybe i am missing a better way to
> get the first result or an added problem of hstore (other than datatypes stored
> that i think i could check for validity in application or database regardless of
> not been integrated in hstore).
>
> Regards,
> Miguel Angel.

SELECT ...
FROM vendor_invoices
JOIN ( SELECT ... FROM eav WHERE category = '' ) icm USING ( invoice_id )

Add a sub-query join for each desired category.

I agree HStore looks cleaner for the query example you give. One apparent advantage is that since each record contains a single value you can use additional columns to store properties (e.g. Units) whereas the hstore makes that more difficult - though not impossible - since the keys are bot naturally related to each other.

From an output perspective hstore looks good. If there are no difficulties on the data input process you might as well just go with it. Given your volume and use-case it really isn't going to matter which option you pick; though I'd suggest providing views related to each major category with only the appropriate values and no access to the hstore column.

David J.

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Linos 2011-10-22 17:14:05 Re: advice on how to store variable attributes
Previous Message Scott Swank 2011-10-22 16:06:55 Re: advice on how to store variable attributes