Re: Request for advice: Table design

From: Andrew Perrin <clists(at)perrin(dot)socsci(dot)unc(dot)edu>
To: Jason Godden <jasongodden(at)optushome(dot)com(dot)au>
Cc: Mattias Kregert <mattias(at)kregert(dot)se>, pgsql-general(at)postgresql(dot)org
Subject: Re: Request for advice: Table design
Date: 2003-06-17 13:37:10
Message-ID: Pine.LNX.4.53.0306170936050.8304@perrin.socsci.unc.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-sql

I agree with Jason. There's no reason to use several tables for data that
are essentially the same in character; that's the point of a relational
database to begin with! I also agree about the orderhistory table: use it
to store cumulative updates, linked one-to-many with the orders table, so
you can get the complete history of an order as well as the latest status
very easily.

Best,
Andy Perrin

----------------------------------------------------------------------
Andrew J Perrin - http://www.unc.edu/~aperrin
Assistant Professor of Sociology, U of North Carolina, Chapel Hill
clists(at)perrin(dot)socsci(dot)unc(dot)edu * andrew_perrin (at) unc.edu

On Tue, 17 Jun 2003, Jason Godden wrote:

> Use one table - it's easier to maintain, more relationally correct and a
> growth of 150K-200K rows a year should be nothing to PostgreSQL.
>
> 2 million rows isn't very many or difficult for PostgreSQL to manage so long
> as it is correctly indexed and vacuumed following any really large bulk data
> changes. The speed of PostgreSQL won't be an issue either providing you have
> allocated enough memory in your pg_hba.conf and you are running on a
> reasonable server (ie - at leats 512mb of ram)
>
> You could actually consider three tables though:
>
> CREATE TABLE orders (
> orderid SERIAL8 NOT NULL PRIMARY KEY,
> ...
> <misc order record stuff>
> ) WITHOUT OIDS;
>
> CREATE TABLE orderstatuses (
> orderstatus VARCHAR(20) NOT NULL PRIMARY KEY,
> statusdesc VARCHAR(200) NOT NULL DEFAULT 'New Status',
> ) WITHOUT OIDS;
>
> CREATE TABLE orderhistory (
> orderid INT8 NOT NULL REFERENCES orders (orderid) ON UPDATE CASCADE ON
> DELETE CASCADE,
> historyuser VARCHAR(20) NOT NULL DEFAULT current_user,
> historywhen TIMESTAMP NOT NULL DEFAULT current_timestamp,
> historystatus VARCHAR(20) NOT NULL REFERENCES orderstatuses (orderstatus) ON
> UPDATE CASCADE ON DELETE RESTRICT,
> CONSTRAINT orderhistory_pkey PRIMARY KEY (orderid,historywhen)
> ) WITHOUT OIDS;
>
> ... and maybe create a non-unique index on orderid in orderhistory.
>
> This way you can always get the status of an order by selecting the most
> recent entry in the orderhistory table pertaining to a particular record and
> use a single plpgsql function add/modify data. You could also use a rewrite
> rule on a view displaying the most recent record. And now you have an order
> history defining an order's state at a point in time.
>
> Your style may differ to mine in terms of id's etc.. I know storing a whole
> bunch of varchar's in the orderhistory table and referencing orderstatuses
> may be a bit funny but I've always been of the opinion that it's best to use
> a key that actually describes the data.
>
> Rgds,
>
> Jason
>
> On Tue, 17 Jun 2003 06:35 pm, Mattias Kregert wrote:
> > Good morning, list!
> > I am in the middle of a project and i am just about to decide how to lay
> > out the table(s) for work orders. I had been planning to use one table, but
> > now i'm not so sure that it would be optimal.
> >
> > The order history will grow with something like 150K-200K rows/year and
> > will have to be saved for ten years. The total number of active
> > (new+validated+planned+ready_for_invoicing) orders would be something like
> > 5K rows.
> >
> > I am thinking about two solutions:
> >
> > 1. One table, "orders" with a column named "status". New orders, validated
> > orders, planned orders, ready for invoicing, and old orders, all in one big
> > table. The status column would be updated a number of times for each order.
> >
> > 2. A number of tables, "new_orders", "validated_orders", "order_history"
> > etc... No status column. Order rows would be moved from one table to
> > another. Perhaps i should have only two tables: "orders" and
> > "order_history"?
> >
> >
> > More tables would mean more indexes (but smaller in size), harder to
> > use/maintain a lot of tables and indexes and triggers and stuff... One
> > table would mean that a lot of inactive orders would slow down access to
> > active rows in the table...
> >
> > Any suggestions? Real world examples?
> >
> > /M
> >
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 1: subscribe and unsubscribe commands go to majordomo(at)postgresql(dot)org
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 7: don't forget to increase your free space map settings
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Mattias Kregert 2003-06-17 13:41:10 Re: Difficulty with quotes
Previous Message Shridhar Daithankar 2003-06-17 13:31:25 Re: full featured alter table?

Browse pgsql-sql by date

  From Date Subject
Next Message Tom Lane 2003-06-17 14:29:07 Re: [HACKERS] Our FLOAT(p) precision does not conform to spec
Previous Message weigelt 2003-06-17 12:40:28 Re: Can the frontend get notifications from the postgres server?