Re: foreign key locks, 2nd attempt

From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: foreign key locks, 2nd attempt
Date: 2011-12-13 21:36:21
Message-ID: 1323808901-sup-7243@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


Excerpts from Noah Misch's message of dom dic 04 09:20:27 -0300 2011:

> > + /*
> > + * If the tuple we're updating is locked, we need to preserve this in the
> > + * new tuple's Xmax as well as in the old tuple. Prepare the new xmax
> > + * value for these uses.
> > + *
> > + * Note there cannot be an xmax to save if we're changing key columns; in
> > + * this case, the wait above should have only returned when the locking
> > + * transactions finished.
> > + */
> > + if (TransactionIdIsValid(keep_xmax))
> > + {
> > + if (keep_xmax_multi)
> > + {
> > + keep_xmax_old = MultiXactIdExpand(keep_xmax,
> > + xid, MultiXactStatusUpdate);
> > + keep_xmax_infomask = HEAP_XMAX_KEYSHR_LOCK | HEAP_XMAX_IS_MULTI;
>
> Not directly related to this line, but is the HEAP_IS_NOT_UPDATE bit getting
> cleared where needed?

AFAICS it's reset along the rest of the HEAP_LOCK_BITS when the tuple is
modified.

> > @@ -2725,11 +2884,20 @@ l2:
> > oldtup.t_data->t_infomask &= ~(HEAP_XMAX_COMMITTED |
> > HEAP_XMAX_INVALID |
> > HEAP_XMAX_IS_MULTI |
> > - HEAP_IS_LOCKED |
> > + HEAP_LOCK_BITS |
> > HEAP_MOVED);
> > + oldtup.t_data->t_infomask2 &= ~HEAP_UPDATE_KEY_INTACT;
> > HeapTupleClearHotUpdated(&oldtup);
> > /* ... and store info about transaction updating this tuple */
> > - HeapTupleHeaderSetXmax(oldtup.t_data, xid);
> > + if (TransactionIdIsValid(keep_xmax_old))
> > + {
> > + HeapTupleHeaderSetXmax(oldtup.t_data, keep_xmax_old);
> > + oldtup.t_data->t_infomask |= keep_xmax_old_infomask;
> > + }
> > + else
> > + HeapTupleHeaderSetXmax(oldtup.t_data, xid);
> > + if (key_intact)
> > + oldtup.t_data->t_infomask2 |= HEAP_UPDATE_KEY_INTACT;
> > HeapTupleHeaderSetCmax(oldtup.t_data, cid, iscombo);
> > /* temporarily make it look not-updated */
> > oldtup.t_data->t_ctid = oldtup.t_self;
>
> Shortly after this, we release the content lock and go off toasting the tuple
> and finding free space. When we come back, could the old tuple have
> accumulated additional KEY SHARE locks that we need to re-copy?

Yeah, I've been wondering about this: do we have a problem already with
exclusion constraints? I mean, if a concurrent inserter doesn't see the
tuple that we've marked here as deleted while we toast it, it could
result in a violated constraint, right? I haven't built a test case to
prove it.

> > @@ -3231,30 +3462,70 @@ l3:
> > {
> > TransactionId xwait;
> > uint16 infomask;
> > + uint16 infomask2;
> > + bool require_sleep;
> >
> > /* must copy state data before unlocking buffer */
> > xwait = HeapTupleHeaderGetXmax(tuple->t_data);
> > infomask = tuple->t_data->t_infomask;
> > + infomask2 = tuple->t_data->t_infomask2;
> >
> > LockBuffer(*buffer, BUFFER_LOCK_UNLOCK);
> >
> > /*
> > - * If we wish to acquire share lock, and the tuple is already
> > - * share-locked by a multixact that includes any subtransaction of the
> > - * current top transaction, then we effectively hold the desired lock
> > - * already. We *must* succeed without trying to take the tuple lock,
> > - * else we will deadlock against anyone waiting to acquire exclusive
> > - * lock. We don't need to make any state changes in this case.
> > + * If we wish to acquire share or key lock, and the tuple is already
> > + * key or share locked by a multixact that includes any subtransaction
> > + * of the current top transaction, then we effectively hold the desired
> > + * lock already (except if we own key share lock and now desire share
> > + * lock). We *must* succeed without trying to take the tuple lock,
>
> This can now apply to FOR UPDATE as well.
>
> For the first sentence, I suggest the wording "If any subtransaction of the
> current top transaction already holds a stronger lock, we effectively hold the
> desired lock already."

I settled on this:

/*
* If any subtransaction of the current top transaction already holds a
* lock as strong or stronger than what we're requesting, we
* effectively hold the desired lock already. We *must* succeed
* without trying to take the tuple lock, else we will deadlock against
* anyone wanting to acquire a stronger lock.
*/
if (infomask & HEAP_XMAX_IS_MULTI)
{
int i;
int nmembers;
MultiXactMember *members;
MultiXactStatus cutoff = get_mxact_status_for_tuplelock(mode);

nmembers = GetMultiXactIdMembers(xwait, &members);

for (i = 0; i < nmembers; i++)
{
if (TransactionIdIsCurrentTransactionId(members[i].xid))
{
if (members[i].status >= cutoff)
{
if (have_tuple_lock)
UnlockTupleTuplock(relation, tid, mode);

pfree(members);
return HeapTupleMayBeUpdated;
}
}
}

pfree(members);
}

Now, I can't see the reason that we didn't previously consider locks "as
strong as what we're requesting" ... but surely it's the same case?

> > + * else we will deadlock against anyone wanting to acquire a stronger
> > + * lock.
>
> > + *
> > + * FIXME -- we don't do the below currently, but I think we should:
> > + *
> > + * We update the Xmax with a new MultiXactId to include the new lock
> > + * mode in this case.
> > + *
> > + * Note that since we want to alter the Xmax, we need to re-acquire the
> > + * buffer lock. The xmax could have changed in the meantime, so we
> > + * recheck it in that case, but we keep the buffer lock while doing it
> > + * to prevent starvation. The second time around we know we must be
> > + * part of the MultiXactId in any case, which is why we don't need to
> > + * go back to recheck HeapTupleSatisfiesUpdate. Also, after we
> > + * re-acquire lock, the MultiXact is likely to (but not necessarily) be
> > + * the same that we see here, so it should be in multixact's cache and
> > + * thus quick to obtain.
>
> What is the benefit of doing so?

After thinking more about it, I think it's bogus. I've removed it.

> Incidentally, why is this level of xlog detail needed for tuple locks? We
> need an FPI of the page before the lock-related changes start scribbling on
> it, and we need to log any xid, even that of a locker, that could land in the
> heap on disk. But, why do we actually need to replay each lock?

Uhm. I remember thinking that a hot standby replica needed it ...

> > + * slightly incorrect, because lockers whose status did not conflict with ours
> > + * are not even considered and so might have gone away anyway.
> > *
> > * But by the time we finish sleeping, someone else may have changed the Xmax
> > * of the containing tuple, so the caller needs to iterate on us somehow.
> > */
> > void
> > -MultiXactIdWait(MultiXactId multi)
> > +MultiXactIdWait(MultiXactId multi, MultiXactStatus status, int *remaining)
>
> This function should probably move (with a new name) to heapam.c (or maybe
> lmgr.c, in part). It's an abstraction violation to have multixact.c knowing
> about lock conflict tables. multixact.c should be marshalling those two bits
> alongside each xid without any deep knowledge of their meaning.

Interesting thought.

> > /*
> > - * Also truncate MultiXactMember at the previously determined offset.
> > + * FIXME there's a race condition here: somebody might have created a new
> > + * segment after we finished scanning the dir. That scenario would leave
> > + * us with an invalid truncateXid in shared memory, which is not an easy
> > + * situation to get out of. Needs more thought.
>
> Agreed. Not sure.
>
> Broadly, this feels like a lot of code to handle truncating the segments, but
> I don't know how to simplify it.

It is a lot of code. And it took me quite a while to even figure out
how to do it. I don't see any other way to go about it.

> > + xmax = HeapTupleGetUpdateXid(tuple);
> > + if (TransactionIdIsCurrentTransactionId(xmax))
> > + {
> > + if (HeapTupleHeaderGetCmax(tuple) >= snapshot->curcid)
> > + return true; /* deleted after scan started */
> > + else
> > + return false; /* deleted before scan started */
> > + }
> > + if (TransactionIdIsInProgress(xmax))
> > + return true;
> > + if (TransactionIdDidCommit(xmax))
> > + {
> > + SetHintBits(tuple, buffer, HEAP_XMAX_COMMITTED, xmax);
> > + /* updating transaction committed, but when? */
> > + if (XidInMVCCSnapshot(xmax, snapshot))
> > + return true; /* treat as still in progress */
> > + return false;
> > + }
>
> In both HEAP_XMAX_MULTI conditional blocks, you do not set HEAP_XMAX_INVALID
> for an aborted updater. What is the new meaning of HEAP_XMAX_INVALID for
> multixacts? What implications would arise if we instead had it mean that the
> updating xid is aborted? That would allow us to get the mid-term performance
> benefit of the hint bit when the updating xid spills into a multixact, and it
> would reduce code duplication in this function.

Well, HEAP_XMAX_INVALID means the Xmax is not valid, period. If there's
a multi whose updater is aborted, there's still a multi that needs to be
checked in various places, so we cannot set that bit.

> I did not review the other tqual.c changes. Could you summarize how the
> changes to the other functions must differ from the changes to
> HeapTupleSatisfiesMVCC()?

I don't think they should differ in any significant way ... if they do,
it's probably bogus. Only HeapTupleSatisfiesVacuum should differ
significantly, because it's a world on its own.

> > --- a/src/bin/pg_resetxlog/pg_resetxlog.c
> > +++ b/src/bin/pg_resetxlog/pg_resetxlog.c
> > @@ -332,6 +350,11 @@ main(int argc, char *argv[])
> > if (set_mxoff != -1)
> > ControlFile.checkPointCopy.nextMultiOffset = set_mxoff;
> >
> > + /*
> > + if (set_mxfreeze != -1)
> > + ControlFile.checkPointCopy.mxactFreezeXid = set_mxfreeze;
> > + */
> > +
> > if (minXlogTli > ControlFile.checkPointCopy.ThisTimeLineID)
> > ControlFile.checkPointCopy.ThisTimeLineID = minXlogTli;
> >
> > @@ -578,6 +601,10 @@ PrintControlValues(bool guessed)
> > ControlFile.checkPointCopy.nextMulti);
> > printf(_("Latest checkpoint's NextMultiOffset: %u\n"),
> > ControlFile.checkPointCopy.nextMultiOffset);
> > + /*
> > + printf(_("Latest checkpoint's MultiXact freezeXid: %u\n"),
> > + ControlFile.checkPointCopy.mxactFreezeXid);
> > + */
>
> Should these changes be live? They look reasonable at first glance.

Oh, I forgot about these. Yeah, these need to be live, but not in the
exact for they have here; there were some tweaks I needed to do IIRC.

> > /*
> > + * A tuple is only locked (i.e. not updated by its Xmax) if it the Xmax is not
> > + * a multixact and it has either the EXCL_LOCK or KEYSHR_LOCK bits set, or if
> > + * the xmax is a multi that doesn't contain an update.
> > + *
> > + * Beware of multiple evaluation of arguments.
> > + */
> > +#define HeapTupleHeaderInfomaskIsLocked(infomask) \
> > + ((!((infomask) & HEAP_XMAX_IS_MULTI) && \
> > + (infomask) & (HEAP_XMAX_EXCL_LOCK | HEAP_XMAX_KEYSHR_LOCK)) || \
> > + (((infomask) & HEAP_XMAX_IS_MULTI) && ((infomask) & HEAP_XMAX_IS_NOT_UPDATE)))
> > +
> > +#define HeapTupleHeaderIsLocked(tup) \
> > + HeapTupleHeaderInfomaskIsLocked((tup)->t_infomask)
>
> I'm uneasy having a HeapTupleHeaderIsLocked() that returns false when a tuple
> is both updated and KEY SHARE-locked. Perhaps HeapTupleHeaderIsUpdated() with
> the opposite meaning, or HeapTupleHeaderIsOnlyLocked()?

I had the IsOnlyLocked thought too. I will go that route.

(I changed HeapTupleHeaderGetXmax to GetRawXmax, thanks for that
suggestion)

--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Greg Smith 2011-12-13 21:55:14 Re: Configuration include directory
Previous Message Robert Haas 2011-12-13 21:31:32 Re: WIP: URI connection string support for libpq