Re: Serialization errors on single threaded request

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Serialization errors on single threaded request
Date: 2005-08-26 18:27:55
Message-ID: s30f18df.021@gwmta.wicourts.gov
Lists: pgsql-bugs

Unfortunately, the original test environment has been blown away in favor of testing the 8.1 beta release. I can confirm that the problem also occurs on a build of the 8.1 beta. If it would be helpful, I could set up 8.0.3 again to confirm it there. I THINK the original build was actually the tip of the 8.0 stable branch rather than the 8.0.3 release proper.

We have a little more information about the failure pattern -- when we get these errors, it is always after there has been a rollback on the thread that eventually generates the serialization error. So I think the pattern is:

ConnectionA:
- A series of insert/update/deletes (on tables OTHER than the progress table).
- Update the progress table.
- Commit the transaction.
ConnectionB:
- A series of insert/update/deletes (on tables OTHER than the progress table) fails.
- Roll back the transaction.
- Attempt each insert/update/delete individually. Commit or roll back each as we go.
- Attempt to update the progress table -- fail on serialization error.
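
The client-side workflow above can be sketched roughly as follows. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for the JDBC connection to PostgreSQL, and the function name `apply_batch`, the tables `item` and `progress`, and the column `last_done` are all hypothetical. It shows the order of commits and rollbacks, not the serialization error itself.

```python
import sqlite3

def apply_batch(conn, statements, progress_value):
    """Run a batch in one transaction; if it fails, retry each statement alone."""
    cur = conn.cursor()
    try:
        for sql, params in statements:
            cur.execute(sql, params)
    except sqlite3.Error:
        conn.rollback()                     # the batch as a whole failed
        for sql, params in statements:      # attempt each statement individually
            try:
                cur.execute(sql, params)
                conn.commit()
            except sqlite3.Error:
                conn.rollback()
    # Update the progress table; this is the step that fails in the report.
    cur.execute("UPDATE progress SET last_done = ?", (progress_value,))
    conn.commit()

# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("CREATE TABLE progress (last_done INTEGER)")
conn.execute("INSERT INTO progress VALUES (0)")
conn.commit()

# The second insert violates the primary key, so the batch fails and is retried.
batch = [
    ("INSERT INTO item VALUES (?, ?)", (1, "a")),
    ("INSERT INTO item VALUES (?, ?)", (1, "dup")),
    ("INSERT INTO item VALUES (?, ?)", (2, "b")),
]
apply_batch(conn, batch, 1)

items = conn.execute("SELECT id, name FROM item ORDER BY id").fetchall()
progress = conn.execute("SELECT last_done FROM progress").fetchone()[0]
```

In the success path (ConnectionA) the batch and the progress update commit together; in the failure path (ConnectionB) the progress update runs in its own transaction after the individual retries.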

To avoid any ambiguity in my earlier posts -- introducing even a very small delay between the operations on ConnectionA and ConnectionB makes the serialization error very infrequent; introducing a larger delay seems to make it go away. I hate to consider that as a solution, however.

I'm afraid I'm not familiar with a good way to capture the stream of communications with the database server. If you could point me in the right direction, I'll give it my best shot.

I did just have a thought, though -- is there any chance that the JDBC Connection.commit returns as soon as the command is written to the TCP buffer, and I'm getting hurt by some network latency issue -- the Nagle algorithm or some such? (I assume that the driver waits for a response from the server before returning, so this shouldn't be the issue.) I also assume that by the time the server sends the commit confirmation, the shared memory changes are visible to the other processes. Is that right?

-Kevin


>>> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> 08/26/05 12:16 PM >>>
"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
> What happens if the timestamp of the commit is an exact match for the
> timestamp of the next transaction start? What is the resolution of
> the time sampling?

It's not done via timestamps: rather, each transaction takes a census
of the transaction XIDs that are running in other backends when it
starts (there is an array in shared memory that lets it get this
information cheaply). Reliability of the system clock is not a factor.
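
As a rough illustration of the census idea, here is a toy model in Python. It is not PostgreSQL's code: the class `ToyServer`, the function `visible`, and the choice to take the snapshot at transaction start are simplifying assumptions made only to show that visibility depends on which XIDs were running, not on clock time.

```python
from dataclasses import dataclass, field

@dataclass
class ToyServer:
    next_xid: int = 1
    running: set = field(default_factory=set)    # stands in for the shared-memory array
    committed: set = field(default_factory=set)

    def begin(self):
        """Start a transaction and take a census of the XIDs running elsewhere."""
        xid = self.next_xid
        self.next_xid += 1
        snapshot = frozenset(self.running)
        self.running.add(xid)
        return xid, snapshot

    def commit(self, xid):
        self.committed.add(xid)
        self.running.discard(xid)

def visible(server, writer_xid, reader_xid, reader_snapshot):
    """Would the reader see a row version written by writer_xid?"""
    if writer_xid == reader_xid:
        return True                  # a transaction sees its own changes
    if writer_xid > reader_xid:
        return False                 # writer started after the reader
    if writer_xid in reader_snapshot:
        return False                 # writer was still running at the census
    return writer_xid in server.committed

server = ToyServer()
a, snap_a = server.begin()           # xid 1
b, snap_b = server.begin()           # xid 2, census contains xid 1
server.commit(a)
c, snap_c = server.begin()           # xid 3, census contains only xid 2

seen_by_b = visible(server, a, b, snap_b)   # b started while a was running
seen_by_c = visible(server, a, c, snap_c)   # c started after a committed
```

In this model a transaction that starts after the commit has completed always sees the committed change, however close together the two events are in wall-clock time.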

Are you sure the server is 8.0.3? There was a bug in prior releases
that might possibly be related:

2005-05-07 17:22 tgl

* src/backend/utils/time/: tqual.c (REL7_3_STABLE), tqual.c
(REL7_4_STABLE), tqual.c (REL7_2_STABLE), tqual.c (REL8_0_STABLE),
tqual.c: Adjust time qual checking code so that we always check
TransactionIdIsInProgress before we check commit/abort status.
Formerly this was done in some paths but not all, with the result
that a transaction might be considered committed for some purposes
before it became committed for others. Per example found by Jan
Wieck.
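
The ordering issue in that commit message can be pictured with a small toy model. This is a sketch under stated assumptions, not the tqual.c code: it assumes a committing transaction first records its commit status and only afterwards leaves the in-progress set, and all names (`ToyXactState`, `status_commit_first`, `status_progress_first`) are hypothetical.

```python
class ToyXactState:
    def __init__(self):
        self.in_progress = set()     # stands in for the in-progress check
        self.commit_log = set()      # stands in for recorded commit status

    def start(self, xid):
        self.in_progress.add(xid)

    def record_commit(self, xid):
        self.commit_log.add(xid)     # step 1 of committing (assumed order)

    def leave_running_set(self, xid):
        self.in_progress.discard(xid)    # step 2 of committing (assumed order)

def status_commit_first(state, xid):
    """Check order used by some code paths before the fix (as modelled here)."""
    if xid in state.commit_log:
        return "committed"
    if xid in state.in_progress:
        return "in progress"
    return "aborted"

def status_progress_first(state, xid):
    """Check order after the fix: in-progress is always tested first."""
    if xid in state.in_progress:
        return "in progress"
    if xid in state.commit_log:
        return "committed"
    return "aborted"

state = ToyXactState()
state.start(7)
state.record_commit(7)               # commit recorded, xid 7 still listed as running

mid_old = status_commit_first(state, 7)
mid_new = status_progress_first(state, 7)

state.leave_running_set(7)           # commit fully finished
after_old = status_commit_first(state, 7)
after_new = status_progress_first(state, 7)
```

Between the two commit steps the two check orders disagree, which is one way a transaction could look committed for some purposes and not others; once the commit has fully finished, both orders agree.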

My recollection though is that this only affected applications that were
using SELECT FOR UPDATE. In any case, it's pretty hard to see how this
would affect an application that is in fact waiting for the backend to
report commit-done before it launches the next transaction; the
race-condition window we were concerned about no longer exists by the
time the backend sends CommandComplete. So my suspicion remains fixed
on that point. Do you have any way of sniffing the network traffic of
the middle-tier to confirm that it's doing what it's supposed to?

regards, tom lane
