Re: Request for advice: Table design

From: Guy Fraser <guy(at)incentre(dot)net>
To:
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Request for advice: Table design
Date: 2003-06-17 19:45:06
Message-ID: 3EEF6FC2.3020707@incentre.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-sql

Hi

As an additional note;

Older data is moved into a seperate table to reduce the number of
records that require regular vacuuming. Since the tables would contain
similar data it is simple to use union selections in a view with an
additional column to indicate which table the data comes from. Using a
view that combines the data from the two tables using a union, the data
will appear to be comming from a single table. This method make archival
access transparent.

I have a realtime data collection system that I built. The data is put
into tables on a yearly and monthly basis on the fly and new tables are
created as needed. I use a union to join tables to access the data over
several months. I just thought of a new idea, I am going to write a
function to join the tables required over a timespan - but that's
another story.

Two tables are easy to join with a union :
{if the column types are exactly matched}

create view all_data
select *,'current_data'::text as data_table from current_data ...
union
select *,'archive_data'::text from archive_data ...
;

The last column will indicate the data's origin.

Now to see all the data :

select * from all_data ;

Thats about it, using this method allows the "dynamic" table to small
for quick maintenace and operation, while the "static" table needs less
maintenace so it can be large with out the penalties incurred by
frequent maintenace.

Guy

Andrew Perrin wrote:

>I agree with Jason. There's no reason to use several tables for data that
>are essentially the same in character; that's the point of a relational
>database to begin with! I also agree about the orderhistory table: use it
>to store cumulative updates, linked one-to-many with the orders table, so
>you can get the complete history of an order as well as the latest status
>very easily.
>
>Best,
>Andy Perrin
>
>----------------------------------------------------------------------
>Andrew J Perrin - http://www.unc.edu/~aperrin
>Assistant Professor of Sociology, U of North Carolina, Chapel Hill
>clists(at)perrin(dot)socsci(dot)unc(dot)edu * andrew_perrin (at) unc.edu
>
>
>On Tue, 17 Jun 2003, Jason Godden wrote:
>
>
>>Use one table - it's easier to maintain, more relationally correct and a
>>growth of 150K-200K rows a year should be nothing to PostgreSQL.
>>
>>2 million rows isn't very many or difficult for PostgreSQL to manage so long
>>as it is correctly indexed and vacuumed following any really large bulk data
>>changes. The speed of PostgreSQL won't be an issue either providing you have
>>allocated enough memory in your pg_hba.conf and you are running on a
>>reasonable server (ie - at leats 512mb of ram)
>>
>>You could actually consider three tables though:
>>
>>CREATE TABLE orders (
>> orderid SERIAL8 NOT NULL PRIMARY KEY,
>> ...
>> <misc order record stuff>
>>) WITHOUT OIDS;
>>
>>CREATE TABLE orderstatuses (
>> orderstatus VARCHAR(20) NOT NULL PRIMARY KEY,
>> statusdesc VARCHAR(200) NOT NULL DEFAULT 'New Status',
>>) WITHOUT OIDS;
>>
>>CREATE TABLE orderhistory (
>> orderid INT8 NOT NULL REFERENCES orders (orderid) ON UPDATE CASCADE ON
>>DELETE CASCADE,
>> historyuser VARCHAR(20) NOT NULL DEFAULT current_user,
>> historywhen TIMESTAMP NOT NULL DEFAULT current_timestamp,
>> historystatus VARCHAR(20) NOT NULL REFERENCES orderstatuses (orderstatus) ON
>>UPDATE CASCADE ON DELETE RESTRICT,
>> CONSTRAINT orderhistory_pkey PRIMARY KEY (orderid,historywhen)
>>) WITHOUT OIDS;
>>
>>... and maybe create a non-unique index on orderid in orderhistory.
>>
>>This way you can always get the status of an order by selecting the most
>>recent entry in the orderhistory table pertaining to a particular record and
>>use a single plpgsql function add/modify data. You could also use a rewrite
>>rule on a view displaying the most recent record. And now you have an order
>>history defining an order's state at a point in time.
>>
>>Your style may differ to mine in terms of id's etc.. I know storing a whole
>>bunch of varchar's in the orderhistory table and referencing orderstatuses
>>may be a bit funny but I've always been of the opinion that it's best to use
>>a key that actually describes the data.
>>
>>Rgds,
>>
>>Jason
>>
>>On Tue, 17 Jun 2003 06:35 pm, Mattias Kregert wrote:
>>
>>>Good morning, list!
>>>I am in the middle of a project and i am just about to decide how to lay
>>>out the table(s) for work orders. I had been planning to use one table, but
>>>now i'm not so sure that it would be optimal.
>>>
>>>The order history will grow with something like 150K-200K rows/year and
>>>will have to be saved for ten years. The total number of active
>>>(new+validated+planned+ready_for_invoicing) orders would be something like
>>>5K rows.
>>>
>>>I am thinking about two solutions:
>>>
>>>1. One table, "orders" with a column named "status". New orders, validated
>>>orders, planned orders, ready for invoicing, and old orders, all in one big
>>>table. The status column would be updated a number of times for each order.
>>>
>>>2. A number of tables, "new_orders", "validated_orders", "order_history"
>>>etc... No status column. Order rows would be moved from one table to
>>>another. Perhaps i should have only two tables: "orders" and
>>>"order_history"?
>>>
>>>
>>>More tables would mean more indexes (but smaller in size), harder to
>>>use/maintain a lot of tables and indexes and triggers and stuff... One
>>>table would mean that a lot of inactive orders would slow down access to
>>>active rows in the table...
>>>
>>>Any suggestions? Real world examples?
>>>
>>>/M
>>>
>>>
>>>---------------------------(end of broadcast)---------------------------
>>>TIP 1: subscribe and unsubscribe commands go to majordomo(at)postgresql(dot)org
>>>
>>
>>---------------------------(end of broadcast)---------------------------
>>TIP 7: don't forget to increase your free space map settings
>>
>>
>
>---------------------------(end of broadcast)---------------------------
>TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/docs/faqs/FAQ.html
>
>
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Guillaume LELARGE 2003-06-17 20:05:56 Re: Bad link on techdocs
Previous Message Ron Johnson 2003-06-17 19:41:03 Re: postgreSQL on NAS/SAN?

Browse pgsql-sql by date

  From Date Subject
Next Message Jonathan Gardner 2003-06-17 20:34:23 Re: [ADMIN] Notification
Previous Message Eric Anderson Vianet SAO 2003-06-17 19:31:04 Re: yet pg_toast reindex