Re: invalid tid errors in latest 7.3.4 stable.

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Stephan Szabo <sszabo(at)megazone(dot)bigpanda(dot)com>
Cc: archeron(at)wavefire(dot)com, Chris Kratz <chris(dot)kratz(at)vistashare(dot)com>, pgsql-hackers(at)postgresql(dot)org, Jan Wieck <JanWieck(at)Yahoo(dot)com>
Subject: Re: invalid tid errors in latest 7.3.4 stable.
Date: 2003-09-30 20:10:04
Message-ID: 21491.1064952604@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Stephan Szabo <sszabo(at)megazone(dot)bigpanda(dot)com> writes:
> On Fri, 26 Sep 2003, Tom Lane wrote:
>> Hmm, that is a good point. It would be cleaner to throw a "can't
>> serialize" failure than have the RI triggers run under a different
>> snapshot. I am not sure if we can implement that behavior easily,
>> though. Can you think of a way to detect whether there's an RI conflict
>> against a later-started transaction?

> Not a complete one yet. :(

After working through the cases it doesn't seem like this would be that
hard to do. Here's my proposal:

1. When running in a READ COMMITTED transaction, the RI triggers simply
need to do SetQuerySnapshot before acting. This will update the snap so
that they can see all rows they need to see; they don't need to fool
with using SnapshotNow. Any triggers or rules fired indirectly as a
result of RI actions can use this same snapshot. No inconsistencies.

2. When running in a SERIALIZABLE transaction, our objective is to use
the transaction's serializable snapshot for triggers and rules fired by
RI actions. Therefore the RI actions should throw "can't serialize"
errors if they would need to affect rows that are not visible under this
snapshot. To see what's involved, break down the problem into cases:

* FK->PK actions (that is, checking a matching PK row exists; we're
trying to do a SELECT FOR UPDATE on the PK table). Four cases:
1. PK row existed before our xact, still exists: no problem
2. PK row existed before our xact, updated: no problem really,
but would be okay to throw serialization error
3. PK row existed before our xact, deleted: must throw error,
but does it matter if we say "serialization error"
or "key not found"?
4. no PK row existed before our xact: really should throw a
"key not found" error even if a row has since been
inserted.

7.3 and CVS tip in fact throw can't-serialize errors in both cases 2 and
3. They get case 4 wrong: they will allow an FK to be inserted even
though the PK row is too young to be visible to our transaction.

I claim that all we need to do here is run the SELECT FOR UPDATE under
the transaction's serializable snapshot. That will give the same
behavior as the existing code for cases 2 and 3, and will give what
I claim is the right behavior (throw error) for case 4.

* PK->FK actions (cascading from a PK update or delete):
1. FK row existed before our xact, still exists: no problem,
just perform the action
2. FK row existed before our xact, updated: must throw serialization
error since we cannot update row in SERIALIZABLE mode
3. FK row existed before our xact, deleted: ideally should throw
serialization error, but would probably be okay to ignore row too
4. FK row inserted during our xact: must throw error, but could
call it either serialization or key-exists.

7.3 throws can't-serialize errors in cases 2 and 3, and a key-exists
error in case 4.

We cannot implement case 4 correctly if we do the scan under our
serialization snapshot (we'll fail to see the row at all). Seems what
we must do is scan using a current snapshot (from a SetQuerySnapshot
taken when the RI trigger is entered), and then throw serialization
error if we find any rows to update or delete that would not be visible
under the transaction's serializable snapshot. If the rows are visible,
it's okay to operate on them with the serializable snapshot as
QuerySnapshot for rules or triggers. Note that this implementation
means that case 3 will not throw errors, because such rows will be
ignored by the scan. I think this is an okay tradeoff for getting the
other cases right.

The Executor API extension that we need to implement this behavior
is to pass in a second "crosscheck" snapshot against which tuples to be
updated/deleted are checked; if they don't pass the crosscheck then
throw serialization error. This only requires minimal mods in
execMain.c. The snapshot-creation code in tqual.c will also need an
extra entry point so that the RI triggers can create a fresh snapshot
even in SERIALIZABLE mode (SetQuerySnapshot won't do this).

Comments?

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Treat 2003-09-30 20:35:58 Re: expanding on syslog help
Previous Message Jan Wieck 2003-09-30 18:41:45 Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)