Re: Speedup twophase transactions

From: Nikhil Sontakke <nikhils(at)2ndquadrant(dot)com>
To: David Steele <david(at)pgmasters(dot)net>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Stas Kelvich <s(dot)kelvich(at)postgrespro(dot)ru>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>, Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Speedup twophase transactions
Date: 2017-03-11 10:26:25
Views: Raw Message | Whole Thread | Download mbox
Lists: pgsql-hackers

Hi David and Michael,

> It would be great to get this thread closed out after 14 months and many
> commits.
PFA, latest patch which addresses Michael's comments.

twophase.c: In function ‘PrepareRedoAdd’:
> twophase.c:2539:20: warning: variable ‘gxact’ set but not used
> [-Wunused-but-set-variable]
> GlobalTransaction gxact;
> There is a warning at compilation.

> The comment at the top of PrescanPreparedTransactions() needs to be
> updated. Not only does this routine look for the contents of
> pg_twophase, but it does also look at the shared memory contents, at
> least those ones marked with inredo and not on_disk.
Changed comments at top of PrescanPreparedTransactions() ,
StandbyRecoverPreparedTransactions(), and RecoverPreparedTransactions().

> + ereport(WARNING,
> + (errmsg("removing future two-phase state data from
> memory \"%u\"",
> + xid)));
> + PrepareRedoRemove(xid);
> + continue
> Those are not normal (partially because unlink is atomic, but not
> durable)... But they match the correct coding pattern regarding
> incorrect 2PC entries... I'd really like to see those switched to a
> FATAL with unlink() made durable for those calls.
Hmm, not sure what exactly we need to do here. If you look at the prior
checks, there we already skip on-disk entries. So, typically, the entries
that we encounter here will be in shmem only.

> + /* Deconstruct header */
> + hdr = (TwoPhaseFileHeader *) buf;
> + Assert(TransactionIdEquals(hdr->xid, xid));
> +
> + if (TransactionIdPrecedes(xid, result))
> + result = xid;
> This portion is repeated three times and could be easily refactored.
> You could just have a routine that returns the oldes transaction ID
> used, and ignore the result for StandbyRecoverPreparedTransactions()
> by casting a (void) to let any kind of static analyzer understand that
> we don't care about the result for example. Handling for subxids is
> necessary as well depending on the code path. Spliting things into a
> could of sub-routines may be more readable as well. There are really a
> couple of small parts that can be gathered and strengthened.
Have added a new function to do this now. It reads either from disk or
shared memory and produces error/log messages accordingly as well now.

> + /*
> + * Recreate its GXACT and dummy PGPROC
> + */
> + gxactnew = MarkAsPreparing(xid, gid,
> + hdr->prepared_at,
> + hdr->owner, hdr->database,
> + gxact->prepare_start_lsn,
> + gxact->prepare_end_lsn);
> MarkAsPreparing() does not need to be extended with two new arguments.
> RecoverPreparedTransactions() is used only at the end of recovery,
> where it is not necessary to look at the 2PC state files from the
> records. In this code path inredo is also set to false :)
That's not true. We will have entries with inredo set at the end of
recovery as well. Infact the MarkAsPreparing() call
from RecoverPreparedTransactions() is the one which will remove these
inredo entries and convert them into regular entries. We have optimized the
recovery code path as well.

> + /* Delete TwoPhaseState gxact entry and/or 2PC file. */
> + PrepareRedoRemove(parsed.twophase_xid);
> Both things should not be present, no? If the file is pushed to disk
> it means that the checkpoint horizon has already moved.
PREPARE in redo, followed by a checkpoint, followed by a COMMIT/ROLLBACK.
We can have both the bits set in this case.

> - ereport(ERROR,
> + /* It's ok to find an entry in the redo/recovery case */
> + if (!gxact->inredo)
> + ereport(ERROR,
> errmsg("transaction identifier \"%s\" is already in
> use",
> gid)));
> + else
> + {
> + found = true;
> + break;
> + }
> I would not have thought so.
> Since we are using the TwoPhaseState structure to track redo entries, at
end of recovery, we will find existing entries. Please see my comments
above for RecoverPreparedTransactions()

> MarkAsPreparing and MarkAsPreparingInRedo really share the same code.
> What if the setup of the dummy PGPROC entry is made conditional?

I realized that MarkAsPreparingInRedo() does not need to do all the sanity
checking since it's going to be invoked during redo and everything that
comes in is kosher already. So its contents are much simplified in this
latest patch.

Tests pass with this latest patch.


Attachment Content-Type Size
twophase_recovery_shmem_110317.patch application/octet-stream 29.5 KB

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2017-03-11 10:59:00 Re: Explicit subtransactions for PL/Tcl
Previous Message Craig Ringer 2017-03-11 06:34:17 Re: Upgrading postmaster's log messages about bind/listen errors