Re: [HACKERS] logical decoding of two-phase transactions

From: Nikhil Sontakke <nikhils(at)2ndquadrant(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>, Sokolov Yura <y(dot)sokolov(at)postgrespro(dot)ru>, Stas Kelvich <s(dot)kelvich(at)postgrespro(dot)ru>, Andres Freund <andres(at)anarazel(dot)de>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: [HACKERS] logical decoding of two-phase transactions
Date: 2017-12-19 08:37:41
Message-ID: CAMGcDxca0zLfpMWTUV+1QT2NjZw+w2Zfzc2NKGfOH41UwFjQmw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> I think we would need to fire invalidations at COMMIT PREPARED, yet
> logically decode them at PREPARE.
>
Yes, we need invalidations to logically decode at PREPARE and then we need
invalidations to be executed at COMMIT PREPARED time as well.

DecodeCommit() needs to know when it's processing a COMMIT PREPARED
whether this transaction was decoded at PREPARE time.The main issue is
that we cannot expect the ReorderBufferTXN structure which was created
at PREPARE time to be around till the COMMIT PREPARED gets called. The
patch earlier was not cleaning this structure at PREPARE and was
adding an is_prepared flag to it so that COMMIT PREPARED knew that it
was decoded at PREPARE time. This structure can very well be not
around when you restart between PREPARE and COMMIT PREPARED, for
example.

So now, it's the onus of the prepare filter callback to always give us
the answer if a given transaction was decoded at PREPARE time or not.
We now hand over the ReorderBufferTxn structure (it can be NULL), xid
and gid and the prepare filter tells us what to do. Always. The
is_prepared flag can be cached in the txn structure to aid in
re-lookups, but if it's not set, the filter could do xid lookup, gid
inspection and other shenanigans to give us the same answer every
invocation around.

Because of the above, we can very well cleanup the ReorderBufferTxn at
PREPARE time and it need not hang around till COMMIT PREPARED gets
called, which is a good win in terms of resource management.

My test cases pass (including the scenario described earlier) with the
above code changes in place.

I have also added crash testing related TAP test cases, they uncovered
a bug in the prepare redo restart code path which I fixed. I believe
this patch is in very stable state now. Multiple runs of the crash TAP
test pass without issues. Multiple runs of "make check-world" with
cassert enabled also pass without issues.

Note that this patch does not contain the HeapTupleSatisfiesVacuum
changes. I believe we need changes to HeapTupleSatisfiesVacuum given
than logical decoding changes the assumption that catalog tuples
belonging to a transaction which never committed can be reclaimed
immediately. With 2PC logical decoding or streaming logical decoding,
we can always have a split time window in which the ongoing decode
cycle needs those tuples. The solution is that even for aborted
transactions, we do not return HEAPTUPLE_DEAD if the transaction id is
newer than the OldestXmin (same logic we use for deleted tuples of
committed transactions). We can do this only for catalog table rows
(both system and user defined) to limit the scope of impact. In any
case, this needs to be a separate patch along with a separate
discussion thread.

Peter, I will submit a follow-on patch with documentation changes
soon. But this patch is complete IMO, with all the required 2PC
logical decoding functionality.

Comments, feedback is most welcome.

Regards,
Nikhils
--
Nikhil Sontakke http://www.2ndQuadrant.com/
PostgreSQL/Postgres-XL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
2pc_logical_19_12_17_without_docs.patch application/octet-stream 103.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Beena Emerson 2017-12-19 08:44:57 Re: [HACKERS] Runtime Partition Pruning
Previous Message Amit Kapila 2017-12-19 08:36:48 Re: access/parallel.h lacks PGDLLIMPORT