Re: Table design

From: Sean Davis <sdavis2(at)mail(dot)nih(dot)gov>
To: Guy Rouillier <guyr(at)masergy(dot)com>, Postgre General <pgsql-general(at)postgresql(dot)org>
Subject: Re: Table design
Date: 2005-12-02 15:35:06
Message-ID: BFB5D3DA.13F87%sdavis2@mail.nih.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 12/2/05 10:21 AM, "Guy Rouillier" <guyr(at)masergy(dot)com> wrote:

> Sean Davis wrote:
>> This might be a bit off-topic, but I'm curious what folks would do
>> with this
>> situation:
>>
>> I have about 6 different tab-delimited file types, all of which store
>> similar information (microarray gene expression). However, the files
>> come from different manufacturers, so each has slightly different
>> fields with different meanings. However, there are a few columns
>> that are shared. I may need to add table formats in the future (as
>> we get more manufacturers). I can think of at least three ways to go
>> about storing these data:
>>
>> 1) Create a single table that has as many columns as needed for ALL
>> formats and make manufacturer-specific views, naming columns in the
>> view as appropriate. Then put rules on the view for inserts,
>> updates, etc. This is my first choice, I think, but adding a new
>> manufacturer's format means creating a new view and possibly adding
>> columns; some columns may NULL for large portions of the table.
>>
>> 2) Use postgres inheritance, but even shared columns in our data may
>> have different names depending on the manufacturer, so there may be
>> views involved anyway.
>
> I'm unclear if this is just a naming issue, or if the fields in the
> files have different meaning. If it's just a case that supplier A names
> a field "foo" while supplier B names a field with the same meaning
> "bar", I would think you'd want to coalesce all these incoming files
> into a single table containing columns that have meaning to your
> organization. The effort then just becomes one of mapping incoming
> fields into the proper columns, but the end result would be something
> much more consistent and meaningful to your organization.
>
> If on the other hand all these incoming fields have different meaning
> and you need to keep them stored separately, I would look into option
> (4): just keep a separate table for each supplier, since you said that
> even shared fields may have different meaning; then use a view over all
> the tables to answer any queries across suppliers.

The columns are a mixture of both cases, hence the thought about
inheritance. However, the column names do have meaning for users of the
manufacturer's products, so they need to be present, also. I'm not the
first to tackle this problem, and most use a common table structure for all
the data, column naming aside.

Sean

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Andrew Schmidt 2005-12-02 16:01:27 Re: Slow COUNT
Previous Message Guy Rouillier 2005-12-02 15:21:59 Re: Table design