Re: data integrity

From: Rich Shepard <rshepard(at)appl-ecosys(dot)com>
To: pdxpug(at)postgresql(dot)org
Subject: Re: data integrity
Date: 2007-03-30 22:33:26
Message-ID: Pine.LNX.4.64.0703301514230.23519@salmo.appl-ecosys.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pdxpug

On Fri, 30 Mar 2007, Thomas J Keller wrote:

> An example of a data integrity issue is where my web form asks for the PI
> last name and the client gives me both first and last name. Or they put
> their name in the PI name field (PI stands for principle investigator -
> the person who got the grant that we will charge funded), or they put an
> extra digit in what should be an 8-digit string if they are here at OHSU,
> but could be any length if they are from somewhere else. Or they use a
> grant identifier (called an "alias") that is valid but belongs to someone
> else. Stuff like that.

Tom,

OK. Now I understand completely. This issue is dbms-independent; it's a
function of the UI or middleware to validate data input before it's stored
in the database table.

> My biggest complaint about FMP is that it can't do rollbacks. So a simple
> error can lead to catastrophic loss of data. Plus, in the last few years
> they've wanted to charge for upgrades about every 6 months. OHSU used to have
> a site license, but dropped it.

I don't know for what 'FMP' are the initials, but it is probably an
application, not a database management system. At the risk of appearing
condescending (for which I'll apologize in advance because I don't mean to
be), the dbms is a storage facility for data. Period. It can do checking
(for example, with triggers), but the type of data entry validation to which
you refer is best done earlier.

For example, your data entry form for PI last name (and I know what a PI
is from my experiences as university faculty) can test for 1) no spaces or
non-alphabetic characters such as a comma or 2) presence on a list of
acceptable names. The latter is preferred.

Your other concerns look to be database schema related. Looking only at
the information in your first paragraph above, I see three tables: pi,
grant, and pigrant.

The 'pi' table lists all principal investigators, and relevant information
about them. Data entry in a charge form could include a read-only widget
such as a combo box that lets the user select from among only those names
available. That resolves getting two names, a first name, or someone else's
name. Each time you add a new PI, that list is automatically updated. The
PI's last name (if they're all unique) or employee number/SSN is the primary
key.

The 'grant' table has the ID number, granting institution, amount, period,
and other grant-specific information. The grant number is the primary key.
Here, too, a date entry validation function can check to ensure that the
number entered is that of a grant in the database.

The 'pigrant' table is a many-to-many intermediate table with the primary
key being (pi_id, grant_number). This allows one PI to be associated with
multiple grants, and each grant to have multiple PIs. Back in the 1970s, the
Salmon River Long Term Ecological Research Project funded by the National
Science Foundation had five PIs: one each from Idaho State U., Michigan
State U., Brigham Young U., the Philadelphia Academy of Natural Sciences,
and the US Forest Service's Northwest Region. If someone enters a valid
grant number, but that number is not associated with the name in the M-M
pigrant table, it's not accepted as a valid entry.

Does this help clarify and resolve your concerns?

Rich

--
Richard B. Shepard, Ph.D. | The Environmental Permitting
Applied Ecosystem Services, Inc. | Accelerator(TM)
<http://www.appl-ecosys.com> Voice: 503-667-4517 Fax: 503-667-8863

In response to

Browse pdxpug by date

  From Date Subject
Next Message Selena Deckelmann 2007-04-06 18:41:48 Meeting, April 17 - Slony
Previous Message Thomas J Keller 2007-03-30 22:02:38 Re: data integrity