Re: [WIP PATCH] Lazily assign xids for toplevel Transactions

From: "Florian G(dot) Pflug" <fgp(at)phlo(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Postgresql-Hackers <pgsql-hackers(at)postgresql(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: [WIP PATCH] Lazily assign xids for toplevel Transactions
Date: 2007-08-25 23:50:53
Message-ID: 46D0C05D.4020405@phlo.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Tom Lane wrote:
> "Florian G. Pflug" <fgp(at)phlo(dot)org> writes:
>> I've spent the last few days factoring out that work, and turning it into
>> a general solution. The result is this patch, which basically does the following
>
>> .) It defines a special TemporaryTransactionId that is used as an xact's xid
>> until the xact calls GetCurrentTransactionId / GetTopTransactionId.
>
> [ squint... ] Why do you need that? The behavior we use with
> subtransactions is just to leave the XID as InvalidOid until there's
> a reason to assign it, and I think we should do the same with top-level
> XIDs. Having an extra TemporaryTransactionId seems ugly, mainly because
> it's not clear how XID comparison should handle it.

I invented the TemporaryTransactionId because it seemed worthwhile to
distinguish between backends which do not currently run a transaction
(xid == InvalidTransactionId), and such which run a transaction that
doesn't yet have an xid assiged (xid == TemporaryTransactionId). Currently,
the TemporaryTransactionId is treated to be later than any other xid
value. I'm not wedded to those TemporaryTransactionIds though - they just
seemed like a good idea when I started with my readonly-queries on PITR-slaves
work, and it allows for a few more assertions.

> To leave XID at 0, you will need to teach GetSnapshotData and maybe
> some other places that a proc could be advertising nonzero xmin even
> when its XID is still 0. This does not seem like a big problem though.

Yeah - TemporaryTransactionId removed the need for a few special cases
in that area - but at the cost of having the distinguish between
TransactionIdIsValid and TransactionIdIsPermanent (meaning valid && !temporary).

>> .) Each transaction get an "rid" (ResourceOwnerId) assigned when it starts, and
>> obtains a lock on that rid, similar to how the xid is locked. This can
>> be used to wait for a transaction's toplevel resource owner to release all
>> it's locks, and serves as a unique identifier for a running transaction.
>
> This seems like inventing a concept we could do without, also overhead
> we could do without (assigning globally unique RIDs would require as
> much mechanism and contention as XID assignment does). What about
> locking the backend's PID instead? I do not see that long-term
> uniqueness is needed, we just want to be able to wait for the backend's
> current transaction to end.
>
> If you do think it's important to distinguish the other guy's current
> and next transaction (which maybe it is), then possibly we could lock a
> combination of the PID and a *local*, per-backend transaction counter
> (there should be plenty of room in LOCKTAG for this). This counter
> value would have to be advertised in PGPROC, but there wouldn't be any
> contention involved to assign a new value.

I wanted some (single) value that would fit into some standard C datatype.
Since I guess using "long long" in postgres code code is a bad idea
(It's not supported on all 32-bit plattforms I think), I wanted to come
up with some 32-bit identifier. If the PID were guaranteed to be 16-bit
we could use the other 16 bits as a counter - but all modern Unixen have
moved past a limit of 65535 processes I fear...

While writing this mail I realized that my RID generation algorithm -
while being quite lightweight I think - has a small race condition.
The algorithm is
for(;;) {
rid = ShmemVariableCache->nextRid++ ;

if (ResourceOwnerIdIsValid(rid) && ResourceOwnerLockTableInsert(rid))
break ;
}

I just realized that if two backend manage to obtain the same rid, and
one than is paged out long enough for the other to lock the rid, run
it's transaction and commit, then the second backend will get the same
rid :-(. So it's back to the drawing board anyway...

> It's slightly annoying that this scheme involves taking two lmgr locks
> per read-write transaction. I wonder whether we couldn't dispense with
> the notion of locking one's XID per se. This would mean that where we
> try to wait for another transaction by XID, we have to trawl through
> the ProcArray to find that XID and see what PID/localID it maps to;
> but if we're in that path we're already going to be blocking, so more
> cycles there might be a good tradeoff for fewer cycles in transaction
> start.

Yeah - I do not really like that dual-locking thing either. But it makes
prepared transaction handling much easier - if we were to only lock the
RID, we'd have to store the rid<->xid mapping for prepared transactions
somewhere *and* guarantee that we won't assign that RID to another transaction -
even after a server restart...

>> 1) The second waiting phase of concurrent index builds fail to wait for xacts
>> that haven't been assigned an XID when the reference shapshot was taken.
>> The "rid" doesn't help here, because it's not currently store in the
>> snapshot.
>
> We'd probably want to whack that around so that it pays attention to the
> xmins advertised by other backends, rather than their XIDs. This is
> just a handwave though.

It's even simpler - In that section of the code, the XID is used just as a
unique identifier for currently running transactions - and talking a snapshot
convenientl creates just such a list of XIDs. The XID isn't used in any
comparisions. So I think I'd be sufficient to scan the proc array after
obtaining the reference snapshot, and just manually build a list of RIDs
instead of XIDs. It wouldn't matter that we build the list slightly later than
the snapshot - the only effect of this would be that we might end up waiting
for a transaction unnecessarily.

However, before putting more work into that part of the code, I'll check
the latest version of HOT. I seem to remember that there was this idea
of storing some kind of xmin with an index that could be used to decide
if an index was usable for a certain snapshot.

greetings, Florian Pflug

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2007-08-26 00:18:40 Re: [WIP PATCH] Lazily assign xids for toplevel Transactions
Previous Message Tom Lane 2007-08-25 22:51:53 Re: reindexdb hangs