Re: Integrity on large sites

From: Scott Marlowe <smarlowe(at)g2switchworks(dot)com>
To: Naz Gassiep <naz(at)mira(dot)net>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Integrity on large sites
Date: 2007-05-23 17:22:47
Message-ID: 46547867.3090905@g2switchworks.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Naz Gassiep wrote:
> I'm working in a project at the moment that is using MySQL, and people
> keep making assertions like this one:
>
> "*Really* big sites don't ever have referential integrity. Or if the few
> spots they do (like with financial transactions) it's implemented on the
> application level (via, say, optimistic locking), never the database level."
>
> This sounds to me like MySQLish. A large DB working with no RI would
> give me nightmares. Is it really true that large sites turn RI off to
> improve performance, or is that just a MySQL thing where it gets turned
> off just because MySQL allows you to turn it off and improve
> performance? Can you even turn RI off in PostgreSQL? Does Oracle, DB2 or
> MSSQL allow you to turn it off? Am I just being naive in thinking that
> everyone runs their DBs with RI in production?

Someone's been drinking the MySQL 3.23 kool aide.

1: The bigger the amount of data you have to store, the more likely you
are to NEED referential integrity to make sure it's not getting all
messed up. Not just financial data either. What about applications
like trouble ticketing systems? Can you imagine having tickets go
orphan in a system to keep track of issues? What about parts inventory
systems? Hospital medication tracking? Transportation scheduling?
Fantasy Football? All of those systems are likely to need RI to make
sure that the data inside them stays coherent. We don't want to have
two customers thinking they have the same quarterback / taxi /
penicillen dosage / broken network router / water pump etc...

2: Handling RI in the application doesn't scale. If everything you do
requires you to check in the app, lock the whole table to prevent race
conditions, and then commit, you'll never scale to any real number of
users. You can have reliability and performance if you do RI in the
database. You only get to pick one if you're gonna do RI in the
application.

3: Of course you can turn off RI in PostgreSQL. Either remove the FK
triggers or disable them db wide. You can the same thing in Oracle as
well. This is normally done during maintenance windows to allow data
that is known to be coherent to be imported quickly. Doing so while
processing transactions is suicidal.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Alexander Staubo 2007-05-23 17:43:50 Re: Integrity on large sites
Previous Message Donald Laurine 2007-05-23 17:12:09 Performance