From: Didier Gasser-Morlay <didiergm(at)ogeane(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: schema design question
Date: 2007-02-24 09:57:29
Message-ID: 20070224105729.76304589@acerdid
Lists: pgsql-general

On Sat, 24 Feb 2007 00:59:02 -0800
snacktime(at)gmail(dot)com (snacktime) wrote:

> Say you have 8 different data models that are related enough to share
> roughly 70% of the same fields, but the shared fields are not always
> the same. And also within any given model, some fields can be empty.
> The business logic is that data is pulled from all the data models and
> put into a common format that has all the combined fields, and sent
> over the wire as a transaction.
>
....
>
> Would most of you create a separate table for each transaction type?

I would personally have one single table; that gives you more room to
add new processing types in the future, rather than having to revisit
all scripts, routines etc. to add a new table.
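
Something like this, say (a sketch only; all names are made up):

  CREATE TABLE transactions (
      transaction_id   serial PRIMARY KEY,    -- integer, ever-increasing
      transaction_type varchar(20) NOT NULL,  -- which of the 8 models
      amount           numeric(12,2),
      -- ... the ~70% shared fields, nullable where a given model
      -- does not use them ...
      created_at       timestamp NOT NULL DEFAULT now()
  );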

You must however be careful about possible contention on your table:
the PK should be an integer and the table should be clustered on it,
so that you always append at the end of the table and can still select
past transactions from it efficiently.
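
In PostgreSQL that could look like the line below (keeping the
hypothetical names from above; note that PostgreSQL's CLUSTER is a
one-off physical reorder on an index, so it has to be re-run from time
to time):

  -- the serial PK gives ever-increasing integers, so new rows land
  -- at the logical end of the table
  CLUSTER transactions_pkey ON transactions;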

> I'm also curious how others would handle the batch closings. In the
> past I have created separate tables for open transactions and
> transactions that have been captured/settled. When a transaction is
> captured it's moved to a different table instead of just having a
> column to mark it as captured. Normally I would select all the
> transactions to capture, insert them into the captured table, delete
> them from the open transactions table, process the batch, and if the
> batch goes through commit everything. That narrows down the number of
> things that can go wrong after you have submitted the batch. The
> alternative would be to just have a column to mark transactions as
> captured and leave them all in one table. I've always been paranoid
> about doing that because it leaves open the possibility of capturing
> thousands of transactions twice if you have a bug, as opposed to a few
> hundred at most.

I would use a marker field. Moving all these transactions around seems
like an awful lot of data to shift, most certainly if you then have a
rollback to do.
IMHO the more data you move around, the more fragmented your DB
becomes and the more you have to look after it. (Note I am not talking
about PostgreSQL here, I am too much of a beginner for that, but that's
my experience with Sybase, Firebird and Oracle.)
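
For instance (column and index names are just my guesses):

  -- marker column: NULL means the transaction is still open
  ALTER TABLE transactions ADD COLUMN captured_at timestamp;

  -- a partial index keeps lookups of open transactions cheap
  CREATE INDEX transactions_open_idx
      ON transactions (transaction_id)
   WHERE captured_at IS NULL;

  -- capturing a batch is then an update, not a row move
  UPDATE transactions
     SET captured_at = now()
   WHERE batch_id = 42
     AND captured_at IS NULL;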

You probably have several safeguards available: your transactions must
share a common field like a batch number, so you could have a separate
table holding a status for each batch (processing, processed, rejected
etc.), significant totals of the batch, totals processed, a timestamp
for each stage and so on.
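
For example (again just a sketch; adjust the statuses and totals to
your own flow):

  CREATE TABLE batches (
      batch_id         serial PRIMARY KEY,
      status           varchar(12) NOT NULL
                       CHECK (status IN ('processing', 'processed',
                                         'rejected')),
      total_amount     numeric(14,2),  -- significant totals of the batch
      amount_processed numeric(14,2),  -- totals actually processed
      started_at       timestamp,
      finished_at      timestamp
  );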

This also gives you the opportunity to put in place a simple monitoring
system (always nice to give your users an insight into what happens in
the system and when).
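
Monitoring then stays a trivial query, e.g.:

  SELECT status, count(*) AS batches, sum(total_amount) AS amount
    FROM batches
   GROUP BY status;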

My personal rule of thumb is that even if disk space is cheap,
bandwidth and I/O are still at a premium. All my colleagues who did not
really play by this rule have produced systems which end up running
like dead cows over time.

Then you can create a sweeping mechanism that once in a while offloads
processed data to a historical table used for MI, reporting etc. It
all depends on what you intend to do with the data.
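
A sketch of such a sweep (assuming a transactions_history table with
the same structure, and the captured_at marker from above):

  -- one transaction: either everything moves or nothing does
  BEGIN;
  INSERT INTO transactions_history
       SELECT *
         FROM transactions
        WHERE captured_at < now() - interval '90 days';
  DELETE FROM transactions
        WHERE captured_at < now() - interval '90 days';
  COMMIT;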

> I spent quite a few years working at payment gateways and am now
> creating an open source platform that does the same thing that your
> normal payment gateway does. It's been a while since I've had the
> chance to look at this problem in a fresh light. Most of the
> processing code at payment gateways is left pretty much untouched once
> it's working, it's not something you go in and refactor every few
> months even if it's not perfect.

Very interesting project :)

>
> Would appreciate any feedback.
>
> Chris
