Re: Small locking bugs in hs

From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Small locking bugs in hs
Date: 2010-01-20 13:13:09
Message-ID: 201001201413.09953.andres@anarazel.de
Lists: pgsql-hackers

On Wednesday 20 January 2010 12:59:40 Simon Riggs wrote:
> On Wed, 2010-01-20 at 04:47 +0100, Andres Freund wrote:
> > On Saturday 16 January 2010 12:32:35 Simon Riggs wrote:
> > > No. As mentioned upthread, this is not a bug.
> >
> > Could you also mention in a little bit more detail why not?
>
> When a cleanup record arrives without a latestRemovedXid we are forced
> to assume that the xid could be as late as latestCompletedXid.
> Regrettably we aren't certain which of the xids are still there since it
> is possible that earlier xids in KnownAssignedXids are actually FATAL
> errors that did not write abort records. So we need to conflict with all
> current snapshots whose xmin is less than latestCompletedXid to be safe.
> This can cause false positives in our assessment of which vxids
> conflict.
> By using exclusive lock we prevent new snapshots from being taken while
> we work out which snapshots to conflict with. This protects those new
> snapshots from also being included in our conflict list.
>
> After the lock is released, we allow snapshots again. It is possible
> that we arrive at a snapshot that is identical to one that we just
> decided we should conflict with. This is a case of false positives, not an
> actual problem.
>
> There are two cases: (1) if we were correct in using latestCompletedXid
> then that means that all xids in the snapshot lower than that are FATAL
> errors, so not xids that ever commit. We can make no visibility errors
> if we allow such xids into the snapshot. (2) if we erred on the side of
> caution and in fact the latestRemovedXid should have been earlier than
> latestCompletedXid then we conflicted with a snapshot needlessly. Taking
> another identical snapshot is OK, because the earlier conflicted
> snapshot was a false positive.
>
> In either case, a snapshot taken after conflict assessment will still be
> valid and non-conflicting even if an identical snapshot that existed
> before conflict assessment was assessed as conflicting.
>
> If we allowed concurrent snapshots while we were deciding who to
> conflict with we would need to include all concurrent snapshotters in
> the conflict list as well. We'd have difficulty in working out exactly
> who that was, so it is happier for all concerned if we take an exclusive
> lock.
>
> It also means that users waiting for a snapshot is a good thing, since
> it is more likely that they will live longer after having waited. So it's
> not a bug for us to use exclusive lock and is actually desirable.
>
> We could reduce false positives by having the master calculate the exact
> xmin each time it issues an XLOG_BTREE_DELETE record. That would
> introduce more contention since that happens during btree split
> operations, so might be counterproductive.
Wow. Thanks for the extensive explanation!
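
To check my reading of the rule described above, here is a minimal sketch in plain C. It is not the actual procarray code: Xid, SnapshotInfo, conflict_limit and collect_conflicts are hypothetical names made up for illustration. It assumes a plain integer comparison (no xid wraparound handling) and follows the "xmin is less than" wording above. In the real code the scan would run while the lock is held exclusively, which is not modeled here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Xid;              /* hypothetical stand-in for TransactionId */
#define INVALID_XID ((Xid) 0)      /* "no xid supplied" marker (assumption) */

/* Hypothetical: one entry per snapshot currently held on the standby. */
typedef struct SnapshotInfo
{
    int backend_id;                /* identifies the snapshot holder */
    Xid xmin;                      /* INVALID_XID if no snapshot is held */
} SnapshotInfo;

/*
 * If the cleanup record carries no latestRemovedXid, fall back to
 * latestCompletedXid, the most pessimistic value we can assume.
 */
Xid
conflict_limit(Xid latest_removed_xid, Xid latest_completed_xid)
{
    if (latest_removed_xid != INVALID_XID)
        return latest_removed_xid;
    return latest_completed_xid;
}

/*
 * Collect the backends whose snapshot xmin is below the limit.
 * "out" must have room for nsnaps entries; returns the number written.
 * Wraparound-aware comparison is deliberately left out of this sketch.
 */
size_t
collect_conflicts(const SnapshotInfo *snaps, size_t nsnaps, Xid limit, int *out)
{
    size_t n = 0;

    for (size_t i = 0; i < nsnaps; i++)
    {
        if (snaps[i].xmin != INVALID_XID && snaps[i].xmin < limit)
            out[n++] = snaps[i].backend_id;
    }
    return n;
}
```

With snapshots at xmin 100 and 250 and a record that has no latestRemovedXid while latestCompletedXid is 200, only the first snapshot ends up in the conflict list. A snapshot taken after the scan with the same xmin 100 is never examined, which is the false-positive case described above.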

Do I understand correctly that in CancelVirtualTransaction LW_SHARED is
taken only so that another transaction can finish during that time?

Andres
