Re: Small locking bugs in hs

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Small locking bugs in hs
Date: 2010-01-20 11:59:40
Message-ID: 1263988780.4043.2046.camel@ebony
Lists: pgsql-hackers

On Wed, 2010-01-20 at 04:47 +0100, Andres Freund wrote:
> On Saturday 16 January 2010 12:32:35 Simon Riggs wrote:
> >
> > No. As mentioned upthread, this is not a bug.
> Could you also mention in a little bit more detail why not?

When a cleanup record arrives without a latestRemovedXid we are forced
to assume that the xid could be as late as latestCompletedXid.
Regrettably we aren't certain which of the xids are still there since it
is possible that earlier xids in KnownAssignedXids are actually FATAL
errors that did not write abort records. So we need to conflict with all
current snapshots whose xmin is less than latestCompletedXid to be safe.
This can cause false positives in our assessment of which vxids
conflict.
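
As a rough illustration of the rule above, here is a toy model in Python. It is not the PostgreSQL source: the Snapshot class, the function names, and the vxid strings are hypothetical, xids are plain integers with no wraparound handling, and the strict less-than comparison simply follows the wording above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Snapshot:
    vxid: str  # hypothetical label for a virtual transaction id
    xmin: int  # oldest xid this snapshot still considers running


def conflict_limit(latest_removed_xid: Optional[int],
                   latest_completed_xid: int) -> int:
    """Pick the xid limit used for conflict assessment.

    If the cleanup record carries no latestRemovedXid, fall back to
    latestCompletedXid, the most pessimistic value we can assume.
    """
    if latest_removed_xid is not None:
        return latest_removed_xid
    return latest_completed_xid


def conflicting_vxids(snapshots: List[Snapshot], limit: int) -> List[str]:
    """Return the vxids of all snapshots whose xmin is below the limit."""
    return [s.vxid for s in snapshots if s.xmin < limit]


snapshots = [Snapshot("3/17", 95), Snapshot("4/2", 100), Snapshot("5/9", 104)]

# Cleanup record without latestRemovedXid: assume latestCompletedXid (103).
conflicts = conflicting_vxids(snapshots, conflict_limit(None, 103))

# Same snapshots, had the record told us the true value was 98.
exact = conflicting_vxids(snapshots, conflict_limit(98, 103))
```

In this made-up example the pessimistic limit conflicts with two snapshots, while the exact value would have conflicted with only one. The extra one is the kind of false positive described above.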

By using an exclusive lock we prevent new snapshots from being taken while
we work out which snapshots to conflict with. This protects those new
snapshots from also being included in our conflict list.

After the lock is released, we allow snapshots again. It is possible
that we arrive at a snapshot that is identical to one that we just
decided we should conflict with. This is a case of false positives, not an
actual problem.

There are two cases: (1) if we were correct in using latestCompletedXid
then that means that all xids in the snapshot lower than that are FATAL
errors, so not xids that ever commit. We can make no visibility errors
if we allow such xids into the snapshot. (2) if we erred on the side of
caution and in fact the latestRemovedXid should have been earlier than
latestCompletedXid then we conflicted with a snapshot needlessly. Taking
another identical snapshot is OK, because the earlier conflicted
snapshot was a false positive.

In either case, a snapshot taken after conflict assessment will still be
valid and non-conflicting even if an identical snapshot that existed
before conflict assessment was assessed as conflicting.
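
The locking argument above can be sketched with a small Python model. This is only an illustration under loud assumptions: the Standby class and its methods are hypothetical, a single threading.Lock stands in for the exclusive lock (the real server distinguishes shared and exclusive modes), and the "during" hook exists purely to make the demonstration deterministic in one thread.

```python
import threading


class Standby:
    """Toy model: one mutex guards both snapshot-taking and assessment."""

    def __init__(self, latest_completed_xid):
        self.lock = threading.Lock()  # stands in for the exclusive lock
        self.latest_completed_xid = latest_completed_xid
        self.snapshots = {}  # vxid -> xmin

    def take_snapshot(self, vxid, xmin, blocking=True):
        # A snapshot can only be registered while the lock is free.
        if not self.lock.acquire(blocking=blocking):
            return False
        try:
            self.snapshots[vxid] = xmin
            return True
        finally:
            self.lock.release()

    def assess_conflicts(self, during=None):
        # Hold the lock for the whole assessment, so the set of
        # snapshots cannot change underneath us.
        with self.lock:
            if during is not None:
                during()  # hook used only by the demonstration below
            return sorted(v for v, xmin in self.snapshots.items()
                          if xmin < self.latest_completed_xid)


standby = Standby(latest_completed_xid=103)
standby.take_snapshot("3/17", 100)

# Try to take an identical snapshot while assessment holds the lock.
attempts = []
conflicts = standby.assess_conflicts(
    during=lambda: attempts.append(
        standby.take_snapshot("4/2", 100, blocking=False)))

# After the lock is released the same snapshot is allowed again.
later = standby.take_snapshot("4/2", 100)
```

In the model, the attempt made during assessment fails, so "4/2" never appears in the conflict list. The identical snapshot taken afterwards succeeds and is left alone, which matches the two cases described above.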

If we allowed concurrent snapshots while we were deciding who to
conflict with we would need to include all concurrent snapshotters in
the conflict list as well. We'd have difficulty in working out exactly
who that was, so it is happier for all concerned if we take an exclusive
lock.

It also means that making users wait for a snapshot is a good thing, since
a snapshot taken after the wait is more likely to live longer. So it's
not a bug for us to use an exclusive lock; it is actually desirable.

We could reduce false positives by having the master calculate the exact
xmin each time it issues an XLOG_BTREE_DELETE record. That would
introduce more contention since that happens during btree split
operations, so might be counterproductive.

--
Simon Riggs www.2ndQuadrant.com
