WIP fix proposal for bug #6123

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: <pgsql-hackers(at)postgresql(dot)org>
Subject: WIP fix proposal for bug #6123
Date: 2011-07-20 18:58:04
Message-ID: 4E26DEEC020000250003F5E9@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

We're nearing completion of testing the migration of a lot of code
which used our custom Java framework into PostgreSQL functions and
triggers. Yesterday our testers ran into surprising behavior
related to delete triggers. A test case is presented on the -bugs
list, but basically it amounts to this:

(1) We have some detail which is summarized by triggers into
related higher-level tables for performance reasons.

(2) On delete, some of the higher level tables should cascade the
delete down to the lower levels.

(3) Sometimes the same tables are involved in both.

This is complicated by the foreign key situation -- due to
conversion of less-than-perfect data and the fact that there is a
legal concept of the elected Clerk of Court in each county being the
"custodian of the data" we have orphaned detail in some tables which
we don't have authority to delete or create bogus parent rows for.
(It would actually be a felony for us to do so, I think.) Until 9.2
we won't be able to create foreign keys for these relationships, but
in each county we've created foreign keys for all relationships
without orphans. So, this is one reason we can't count on foreign
key declarations, with the ON DELETE CASCADE option, yet we don't
want to drop the foreign keys where they exist, as they help prevent
further degradation of the data integrity. So the DELETE from the
children should occur in the BEFORE trigger to avoid upsetting FK
logic. Otherwise we could move the cascading deletes to the AFTER
DELETE trigger, where this odd behavior doesn't occur.

So basically, the goal of this patch is to ensure that a BEFORE
DELETE trigger doesn't silently fail to delete a row because that
row was updated during the BEFORE DELETE trigger firing (in our case
by secondary trigger firing).

If that description was too hard to follow, let me know and I'll try
again. :-/

[Summarizing discussion on the -bugs list,] Tom didn't feel that
there was a need to support application code which does what I
describe above, and he felt that fixing it would open a can of
worms, with logical quandaries about correct behavior. Basically,
the changes I made were within switch statements where if the row
was found to be HeapTupleUpdated in READ COMMITTED, it would follow
the ctid chain; I used similar logic for HeapTupleSelfUpdated
regardless of transaction isolation level. The reasons for not
re-firing delete triggers here is the same for why delete triggers
are not fired in the existing case -- it's just one delete.

No claims are made for completeness of this patch -- it's a "proof of
concept" on which I hope to get comments. Before this patch would be
production ready I would need to check for similar needs on UPDATE,
and would need to check to make sure there is no resource leakage.
It passes `make check-world`, `make installcheck-world` with a
database set for serializable transaction isolation, and the
isolation tests. And of course, it doesn't show the behavior which
we found so astonishing -- we no longer see an attempted delete of a
parent succeed in deleting the children but leave the parent sitting
there.

A patch against 9.0 based on this approach may be find its way into
production here in about two weeks if there are no
contra-indications, so any review would be very much appreciated.

-Kevin

Attachment Content-Type Size
delete-before-fix-wip1.patch text/plain 1.6 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2011-07-20 19:23:07 Re: WIP fix proposal for bug #6123
Previous Message Robert Haas 2011-07-20 18:55:34 Re: A few user-level questions on Streaming Replication and pg_upgrade