Re: Speedup twophase transactions

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Nikhil Sontakke <nikhils(at)2ndquadrant(dot)com>
Cc: Stas Kelvich <s(dot)kelvich(at)postgrespro(dot)ru>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>, Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Speedup twophase transactions
Date: 2017-02-23 07:20:34
Message-ID: CAB7nPqTn05yV-Fzt3HQo7A7fmrzpNxF7L0uiOrDfkivwLYQGGg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Feb 2, 2017 at 3:07 PM, Nikhil Sontakke <nikhils(at)2ndquadrant(dot)com> wrote:
>>> Yeah. Was thinking about this yesterday. How about adding entries in
>>> TwoPhaseState itself (which become valid later)? Only if it does not
>>> cause a lot of code churn.
>>
>> That's possible as well, yes.
>
> PFA a patch which does the above. It re-uses the TwoPhaseState gxact
> entries to track 2PC PREPARE/COMMIT in shared memory. The advantage
> being that CheckPointTwoPhase() becomes the only place where the fsync
> of 2PC files happens.
>
> A minor annoyance in the patch is the duplication of the code to add
> the 2nd while loop to go through these shared memory entries in
> PrescanPreparedTransactions, RecoverPreparedTransactions and
> StandbyRecoverPreparedTransactions.
>
> Other than this, I ran TAP tests and they succeed as needed.

Thanks for the new patch. Finally I am looking at it... The regression
tests already committed are all able to pass.

twophase.c: In function ‘PrepareRedoAdd’:
twophase.c:2539:20: warning: variable ‘gxact’ set but not used
[-Wunused-but-set-variable]
GlobalTransaction gxact;
There is a warning at compilation.

The comment at the top of PrescanPreparedTransactions() needs to be
updated. Not only does this routine look for the contents of
pg_twophase, but it does also look at the shared memory contents, at
least those ones marked with inredo and not on_disk.

+ ereport(WARNING,
+ (errmsg("removing future two-phase state data from
memory \"%u\"",
+ xid)));
+ PrepareRedoRemove(xid);
+ continue
Those are not normal (partially because unlink is atomic, but not
durable)... But they match the correct coding pattern regarding
incorrect 2PC entries... I'd really like to see those switched to a
FATAL with unlink() made durable for those calls.

+ /* Deconstruct header */
+ hdr = (TwoPhaseFileHeader *) buf;
+ Assert(TransactionIdEquals(hdr->xid, xid));
+
+ if (TransactionIdPrecedes(xid, result))
+ result = xid;
This portion is repeated three times and could be easily refactored.
You could just have a routine that returns the oldes transaction ID
used, and ignore the result for StandbyRecoverPreparedTransactions()
by casting a (void) to let any kind of static analyzer understand that
we don't care about the result for example. Handling for subxids is
necessary as well depending on the code path. Spliting things into a
could of sub-routines may be more readable as well. There are really a
couple of small parts that can be gathered and strengthened.

+ /*
+ * Recreate its GXACT and dummy PGPROC
+ */
+ gxactnew = MarkAsPreparing(xid, gid,
+ hdr->prepared_at,
+ hdr->owner, hdr->database,
+ gxact->prepare_start_lsn,
+ gxact->prepare_end_lsn);
MarkAsPreparing() does not need to be extended with two new arguments.
RecoverPreparedTransactions() is used only at the end of recovery,
where it is not necessary to look at the 2PC state files from the
records. In this code path inredo is also set to false :)

+ {
+ /*
+ * Entry could be on disk. Call with giveWarning=false
+ * since it can be expected during replay.
+ */
+ RemoveTwoPhaseFile(xid, false);
+ }
This would be removed at the end of recovery anyway as a stale entry,
so that's not necessary.

+ /* Delete TwoPhaseState gxact entry and/or 2PC file. */
+ PrepareRedoRemove(parsed.twophase_xid);
Both things should not be present, no? If the file is pushed to disk
it means that the checkpoint horizon has already moved.

- ereport(ERROR,
+ /* It's ok to find an entry in the redo/recovery case */
+ if (!gxact->inredo)
+ ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_OBJECT),
errmsg("transaction identifier \"%s\" is already in use",
gid)));
+ else
+ {
+ found = true;
+ break;
+ }
I would not have thought so.

MarkAsPreparing and MarkAsPreparingInRedo really share the same code.
What if the setup of the dummy PGPROC entry is made conditional?
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2017-02-23 07:24:08 Re: DROP SUBSCRIPTION and ROLLBACK
Previous Message Yugo Nagata 2017-02-23 07:17:35 Re: Declarative partitioning - another take