Re: [HACKERS] logical decoding of two-phase transactions

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] logical decoding of two-phase transactions
Date: 2021-03-15 03:03:39
Message-ID: CAA4eK1LVEdPYnjdajYzu3k6KEii1+F0jdQ6sWnYugiHcSGZD6Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Mar 12, 2021 at 2:09 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> Please find attached the latest patch set v58*
>

In this patch-series, I see a problem with synchronous replication
when GUC 'synchronous_standby_names' is configured to use subscriber.
This will allow Prepares and Commits to wait for the subscriber to
finish. Before this patch, we never send prepare as two-phase was not
enabled by a subscriber, so it won't wait for it, rather it will make
progress because we send keep_alive messages. But after this patch, it
will start waiting for Prepare to finish. Now, without spool-file
logic, it will work because prepares are decoded on subscriber and a
corresponding ack will be sent to a publisher but for the spool-file
case, we will wait for Publisher to send commit prepared and in
publisher prepare is not finished because we are waiting for its ack.
So, it will create a sort of deadlock. This is related to the problem
as mentioned in the below comments in the patch:
+ * A future release may be able to detect when all tables are READY and set
+ * a flag to indicate this subscription/slot is ready for two_phase
+ * decoding. Then at the publisher-side, we could enable wait-for-prepares
+ * only when all the slots of WALSender have that flag set.

The difference is that it can happen now itself, prepares
automatically wait if 'synchronous_standby_names' is set. Now, we can
imagine a solution where after spooling to file the changes which
can't be applied during syncup phase, we update the flush location so
that publisher can proceed with that prepare. But I think that won't
work because once we have updated the flush location those prepares
won't be sent again and it is quite possible that we don't have
complete relation information as the schema is not sent with each
transaction. Now, we can go one step further and try to remember the
schema information the first time it is sent so that it can be reused
after restart but I think that will complicate the patch and overall
design.

I think there is a simpler solution to these problems. The idea is to
enable two_phase after the initial sync is over (all relations are in
a READY state). If we switch-on the 2PC only after all the relations
come to the READY state then we shouldn't get any prepare before
sync-point. However, it is quite possible that before reaching
syncpoint, the slot corresponding to apply-worker has skipped because
2PC was not enabled, and afterward, prepare would be skipped because
by that start_decoding_at might have moved. See the explanation in an
email: https://www.postgresql.org/message-id/CAA4eK1LuK4t-ZYYCY7k9nMoYP%2Bdwi-JyqUdtcffQMoB_g5k6Hw%40mail.gmail.com.
Now, even the initial_consistent_point won't help because for
apply-worker, it will be different from tablesync slot's
initial_consistent_point and we would have reached initial consistency
earlier for apply-workers.

To solve the main problem (how to detect the prepares that are skipped
when we toggled the two_pc option) in the above idea, we can mark an
LSN position in the slot (two_phase_at, this will be the same as
start_decoding_at point when we receive slot with 2PC option) where we
enable two_pc. If we encounter any commit prepared whose prepare LSN
is less than two_phase_at, then we need to send prepare for the
transaction along with commit prepared.

For this solution on the subscriber-side, I think we need a tri-state
column (two_phase) in pg_subscription. It can have three values
'disable', 'can_enable', 'enable'. By default, it will be 'disable'.
If the user enables 2PC, then we can set it to 'can_enable' and once
we see all relations are in a READY state, restart the apply-worker
and this time while starting the streaming, send the two_pc option and
then we can change the state to 'enable' so that future restarts won't
send this option again. Now on the publisher side, if this option is
present, it will change the value of two_phase_at in the slot to
start_decoding_at. I think something on these lines should be much
easier than the spool-file implementation unless we see any problem
with this idea.

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2021-03-15 03:17:28 Re: recovering from "found xmin ... from before relfrozenxid ..."
Previous Message Thomas Munro 2021-03-15 02:47:15 Re: Change JOIN tutorial to focus more on explicit joins