Re: 2PC support for pglogical

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Stas Kelvich <s(dot)kelvich(at)postgrespro(dot)ru>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: 2PC support for pglogical
Date: 2016-03-23 05:44:57
Message-ID: CAMsr+YGyDDjn=oMPDDsVdrMb3E_Vx_ek8zA3Tn+RFiGep7Yh-Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 10 March 2016 at 22:50, Stas Kelvich <s(dot)kelvich(at)postgrespro(dot)ru> wrote:

> Hi.
>
> Here is proof-of-concept version of two phase commit support for logical
> replication.

I missed this when you posted it, so sorry for the late response.

I've read through this but not tested it yet. I really appreciate you doing
it, it's been on my wishlist/backburner for ages.

On reading through the patch I noticed that there doesn't seem to be any
consideration of locking. The prepared xact can still hold strong locks on
catalogs. How's that handled? I think Robert's group locking stuff is what
we'll want here - for a prepared xact we can join the prepared xact as a
group lock member so we inherit its locks. Right now if you try DDL in a
prepared xact I suspect it'll break.

On a more minor note, I think the pglogical_output and pglogical changes
should be a separate patch to the patch to core PostgreSQL. Keep them
separate. "git format-patch" on a tree is good for this.

A few misc comments on my first read-through:

Shouldn't PGLOGICAL_COMMIT etc be an enum { } ?

+/*
+ * ParsePrepareRecord
+ */

isn't a super-helpful comment.

Not sure what this means:

+ /* Ignoring abortrels? */
+ bufptr += MAXALIGN(hdr->nabortrels * sizeof(RelFileNode));

typo:

+ ... shortcut for coomiting bare xact ...

DecodePrepare seems to be mostly a copy of DecodeCommit. Can that be done
with less duplication?

Not sure I understand the rationale of "bare xact" as terminology/naming,
e.g. ReorderBufferCommitBareXact . I guess you're getting at the fact that
it's just a commit or rollback, not data. Phase2 ? 2PC?

> * Seems that only reliable way to get GID during replay of commit/rollback
> prepared is to force postgres to write GID in corresponding records,
> otherwise we can lose correspondence between xid and gid if we are
> replaying data from wal sender while transaction was commited some time
> ago. So i’ve changed postgres to write gid’s not only on prepare, but also
> on commit/rollback prepared. That should be done only in logical level, but
> now I just want to here some other opinions on that.
>

That sounds sensible.

> * Abort prepared xlog record also lack database information. Normally
> logical decoding just cleans reorder buffer when facing abort, but in case
> of 2PC we should send it to callbacks anyway. So I’ve added that info to
> abort records.
>

I was wondering what that was. Again, seems sensible, though I'm not
totally convinced of the naming.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2016-03-23 06:01:03 Re: Some messages of pg_rewind --debug not getting translated
Previous Message Tatsuo Ishii 2016-03-23 05:20:04 Re: multivariate statistics v14