Re: TransactionIdIsInProgress() cache

From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
Cc: <pgsql-patches(at)postgresql(dot)org>
Subject: Re: TransactionIdIsInProgress() cache
Date: 2008-03-11 12:57:23
Message-ID: 47D681B3.5070309@enterprisedb.com
Lists: pgsql-patches

Simon Riggs wrote:
> We currently have a single item cache of the last checked TransactionId,
> which optimises the call to TransactionIdDidCommit() during
> HeapTupleSatisfiesMVCC() and partners.
>
> Before we call TransactionIdDidCommit() we always call
> TransactionIdIsInProgress().
>
> TransactionIdIsInProgress() doesn't check the single item cache, so even
> if we have just checked for this xid, we will check it again. Since this
> function takes ProcArrayLock and may be called while holding other locks
> it will improve scalability if we can skip the call, for the cost of an
> integer comparison.

Seems plausible. Have you done any performance testing?

I presume the case where this would help would be when you populate a
large table, with COPY for example, and then run a seq scan on it. As all
rows in the table have the same xmin, you keep testing for the same XID
over and over again.

To matter from a scalability point of view, there would need to be a lot
of concurrent activity competing for the lock. Can you formulate a
test case for that?

> Following patch implements fastpath in TransactionIdIsInProgress() to
> utilise single item cache.
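
As a rough illustration of the idea (not the patch itself), the fast path can be modelled in standalone C. Everything here is an assumption made for the sketch: the names cached_xid, remember_finished_xid, slow_is_in_progress and is_in_progress are hypothetical, the slow path merely counts calls instead of taking ProcArrayLock, and the cache is assumed to hold only xids whose final status has already been looked up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)

/* Hypothetical single-item cache: last xid whose final status was fetched. */
static TransactionId cached_xid = InvalidTransactionId;

/* Counts how often the expensive, lock-taking path would have run. */
static int slow_path_calls = 0;

/* Stand-in for the proc array scan done under ProcArrayLock. */
static bool
slow_is_in_progress(TransactionId xid)
{
    (void) xid;
    slow_path_calls++;
    return false;               /* model: every xid has already finished */
}

/* Stand-in for the commit-status lookup that fills the cache. */
static void
remember_finished_xid(TransactionId xid)
{
    cached_xid = xid;
}

/* Fast path first: one integer comparison before any locking. */
static bool
is_in_progress(TransactionId xid)
{
    if (xid != InvalidTransactionId && xid == cached_xid)
        return false;           /* cached final status, so not running */
    return slow_is_in_progress(xid);
}
```

In this model, repeated checks of the same finished xid (the seq-scan-after-COPY case) reach the slow path only until the xid lands in the cache.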

Hmm. The pattern in tqual.c is:

> if (!TransactionIdIsInProgress(xvac))
> {
>     if (TransactionIdDidCommit(xvac))
>     {
>         /* committed */
>     }
>     else
>     {
>         /* aborted */
>     }
> }
> else
> {
>     /* in-progress */
> }

We could do this instead:

> if (TransactionIdDidCommit(xvac))
> {
>     /* committed */
> }
> else if (!TransactionIdIsInProgress(xvac))
> {
>     if (TransactionIdDidCommit(xvac))
>     {
>         /* committed */
>     }
>     else
>     {
>         /* aborted */
>     }
> }
> else
> {
>     /* in-progress */
> }

(hopefully there would be a way to macroize that or something to avoid
bloating the code any more.)

For committed transactions, this would save the
TransactionIdIsInProgress call completely, whether or not it's in the
one-item cache. The tradeoff is that we would have to call
TransactionIdDidCommit twice for aborted transactions.
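
The tradeoff can be made concrete with a small standalone model that counts calls under both orderings. This is only a sketch under stated assumptions: ModelXact, model_did_commit, model_is_in_progress, classify_current and classify_proposed are hypothetical names, and the transaction state is fixed for the duration of a check, so the race that the second TransactionIdDidCommit call presumably guards against (a commit landing between the first two checks) is not modelled.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { XACT_COMMITTED, XACT_ABORTED, XACT_IN_PROGRESS } XactState;

/* A modelled transaction plus counters for each kind of lookup. */
typedef struct
{
    XactState state;            /* true state, fixed in this model */
    int       did_commit_calls;
    int       in_progress_calls;
} ModelXact;

/* Stand-in for TransactionIdDidCommit(). */
static bool
model_did_commit(ModelXact *x)
{
    x->did_commit_calls++;
    return x->state == XACT_COMMITTED;
}

/* Stand-in for TransactionIdIsInProgress(). */
static bool
model_is_in_progress(ModelXact *x)
{
    x->in_progress_calls++;
    return x->state == XACT_IN_PROGRESS;
}

/* Existing ordering: in-progress check first, then commit status. */
static XactState
classify_current(ModelXact *x)
{
    if (!model_is_in_progress(x))
        return model_did_commit(x) ? XACT_COMMITTED : XACT_ABORTED;
    return XACT_IN_PROGRESS;
}

/* Suggested ordering: commit status first, then the original pattern. */
static XactState
classify_proposed(ModelXact *x)
{
    if (model_did_commit(x))
        return XACT_COMMITTED;
    if (!model_is_in_progress(x))
        return model_did_commit(x) ? XACT_COMMITTED : XACT_ABORTED;
    return XACT_IN_PROGRESS;
}
```

In this model a committed transaction costs one commit lookup and no in-progress check under the suggested ordering, while an aborted one costs two commit lookups plus one in-progress check.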

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
