Re: Speedup twophase transactions

From: Nikhil Sontakke <nikhils(at)2ndquadrant(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: David Steele <david(at)pgmasters(dot)net>, Stas Kelvich <s(dot)kelvich(at)postgrespro(dot)ru>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>, Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Speedup twophase transactions
Date: 2017-03-15 07:48:29
Message-ID: CAMGcDxfKYCGkHytFOgbRpS3+RXGkGeJdHvCruwaYnOgfgCvadQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> Thanks for the new version. Let's head toward a final patch.
>
>
:-)

>
>

>
> Have added a new function to do this now. It reads either from disk or
> > shared memory and produces error/log messages accordingly as well now.
>
> And that's ProcessTwoPhaseBufferAndReturn in the patch.
> ProcessTwoPhaseBuffer may be a better name.
>
>
Renamed to ProcessTwoPhaseBuffer()

>
> Since we are using the TwoPhaseState structure to track redo entries, at
> end
> > of recovery, we will find existing entries. Please see my comments above
> for
> > RecoverPreparedTransactions()
>
> This is absolutely not good, because it is a direct result of the
> interactions of the first loop of RecoverPreparedTransaction() with
> its second loop, and because MarkAsPreparing() can finished by being
> called *twice* from the same transaction. I really think that this
> portion should be removed and that RecoverPreparedTransactions()
> should be more careful when scanning the entries in pg_twophase by
> looking up at what exists as well in shared memory, instead of doing
> that in MarkAsPreparing().
>
>
Ok. Modified MarkAsPreparing() to call a new MarkAsPreparingGuts()
function. This function takes in a "gxact' and works on it.

RecoverPreparedTransaction() now calls a newly added
RecoverFromTwoPhaseBuffer() function which checks if an entry already
exists via redo and calls the MarkAsPreparingGuts() function by passing in
that gxact. Otherwise the existing MarkAsPreparing() gets called.

> Here are some more comments:
>
> + /*
> + * Recreate its GXACT and dummy PGPROC
> + */
> + gxactnew = MarkAsPreparing(xid, gid,
> + hdr->prepared_at,
> + hdr->owner, hdr->database,
> + gxact->prepare_start_lsn,
> + gxact->prepare_end_lsn);
> +
> + Assert(gxactnew == gxact);
> Here it would be better to set ondisk to false. This makes the code
> more consistent with the previous loop, and the intention clear.
>
>
Done.

> The first loop of RecoverPreparedTransactions() has a lot in common
> with its second loop. You may want to refactor a little bit more here.
>
>
Done. Added the new function RecoverFromTwoPhaseBuffer() as mentioned above.

> +/*
> + * PrepareRedoRemove
> + *
> + * Remove the corresponding gxact entry from TwoPhaseState. Also
> + * remove the 2PC file.
> + */
> This could be a bit more expanded. The removal of the 2PC does not
> happen after removing the in-memory data, it would happen if the
> in-memory data is not found.
>
>
Done

> +MarkAsPreparingInRedo(TransactionId xid, const char *gid,
> + TimestampTz prepared_at, Oid owner, Oid databaseid,
> + XLogRecPtr prepare_start_lsn, XLogRecPtr prepare_end_lsn)
> +{
> + GlobalTransaction gxact;
> +
> + LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
> MarkAsPreparingInRedo is internal to twophase.c. There is no need to
> expose it externally and it is just used in PrepareRedoAdd so you
> could just group both.
>
>
Removed this MarkAsPreparingInRedo() function and inlined the code in
PrepareRedoAdd().

bool valid; /* TRUE if PGPROC entry is in proc array */
> bool ondisk; /* TRUE if prepare state file is on disk */
> + bool inredo; /* TRUE if entry was added via xlog_redo */
> We could have a set of flags here, that's the 3rd boolean of the
> structure used for a status.
>

This is more of a cleanup and does not need to be part of this patch. This
can be a follow-on cleanup patch.

I also managed to do some perf testing.

Modified Stas' earlier scripts slightly:

\set naccounts 100000 * :scale

\set from_aid random(1, :naccounts)

\set to_aid random(1, :naccounts)

\set delta random(1, 100)

\set scale :scale+1

BEGIN;

UPDATE pgbench_accounts SET abalance = abalance - :delta WHERE aid =
:from_aid;

UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid =
:to_aid;

PREPARE TRANSACTION ':client_id.:scale';

COMMIT PREPARED ':client_id.:scale';

Created a base backup with scale factor 125 on an AWS t2.large instance.
Set up archiving and did a 20 minute run with the above script saving the
WALs in the archive.

Then used recovery.conf to point to this WAL location and used the base
backup to recover.

With this patch applied: 20s

Without patch: Stopped measuring after 5 minutes ;-)

Regards,

Nikhils

--
Nikhil Sontakke http://www.2ndQuadrant.com/
PostgreSQL/Postgres-XL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
twophase_recovery_shmem_150317.patch application/octet-stream 32.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Aleksander Alekseev 2017-03-15 09:04:11 Re: [PATCH] Suppress Clang 3.9 warnings
Previous Message Amit Langote 2017-03-15 07:42:46 Re: New procedural language