Re: Perform streaming logical transactions by background workers and parallel apply

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Perform streaming logical transactions by background workers and parallel apply
Date: 2022-11-12 11:05:46
Message-ID: CAA4eK1Jom_hmw19YrtEmZ4MswSsu8imi_JSsRcqgz7G76jvs6Q@mail.gmail.com
Lists: pgsql-hackers

On Fri, Nov 11, 2022 at 2:12 PM houzj(dot)fnst(at)fujitsu(dot)com
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>

Few comments on v46-0001:
======================
1.
+static void
+apply_handle_stream_abort(StringInfo s)
{
...
+ /* Send STREAM ABORT message to the parallel apply worker. */
+ parallel_apply_send_data(winfo, s->len, s->data);
+
+ if (abort_toplevel_transaction)
+ {
+ parallel_apply_unlock_stream(xid, AccessExclusiveLock);

Shouldn't we release this lock before sending the message, as we do
for stream_prepare and stream_commit? If there is a reason for doing
it differently here then let's add a comment explaining it.
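To illustrate the ordering I have in mind (just a rough sketch based
on the hunk above, not tested):

	if (abort_toplevel_transaction)
	{
		/*
		 * Release the stream lock before handing the message over, the
		 * same way we do for stream_prepare and stream_commit, so that
		 * the parallel apply worker is not blocked on us while applying
		 * the abort.
		 */
		parallel_apply_unlock_stream(xid, AccessExclusiveLock);
	}

	/* Send STREAM ABORT message to the parallel apply worker. */
	parallel_apply_send_data(winfo, s->len, s->data);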

2. It seems that once the patch sets the file state to busy
(LEADER_FILESET_BUSY), the file becomes accessible again only after
the leader apply worker receives a transaction end message like
stream_commit. Is my understanding correct? If yes, then why can't we
make it accessible after the stream_stop message? Are you worried
about the concurrency handling for reading and writing the file? If
so, we can probably deal with it via some lock taken around reading
and writing each change to the file, as sketched below. I think after
this we may not need the additional stream-level lock/unlock in
parallel_apply_spooled_messages. I understand that you probably want
to keep the code simple, so I am not suggesting changing it
immediately, but I just wanted to know whether you have considered
alternatives here.
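Roughly what I am imagining, purely to illustrate the idea (the
per-change lock helpers below are made-up names, and the actual
interaction will need more care):

	/* Leader, while spooling each change to the file: */
	lock_fileset_change(winfo);		/* hypothetical per-change lock */
	stream_write_change(action, s);
	unlock_fileset_change(winfo);	/* hypothetical */

	/* Parallel apply worker, while reading each spooled change: */
	lock_fileset_change(winfo);		/* hypothetical */
	nbytes = BufFileRead(stream_fd, buffer, len);
	unlock_fileset_change(winfo);	/* hypothetical */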

3. Don't we need to release the transaction lock at stream_abort in
the parallel apply worker? I understand that we are not waiting for it
in the leader worker, but the parallel apply worker should still
release it if it acquired the lock at stream_start.
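Something like the following where the parallel apply worker handles
the abort of the toplevel transaction (the function name and lock mode
are only placeholders for whatever the patch actually uses for that
lock):

	if (toplevel_xact)
	{
		/* Release the lock we took on this transaction at stream_start. */
		parallel_apply_unlock_transaction(xid, AccessShareLock);	/* placeholder name */
	}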

4. A minor comment change as below:
diff --git a/src/backend/replication/logical/worker.c
b/src/backend/replication/logical/worker.c
index 43f09b7e9a..c771851d1f 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -1851,6 +1851,9 @@ apply_handle_stream_abort(StringInfo s)
parallel_apply_stream_abort(&abort_data);

 /*
+ * We need to wait after processing rollback to savepoint for the next set
+ * of changes.
+ *
 * By the time parallel apply worker is processing the changes in
 * the current streaming block, the leader apply worker may have
 * sent multiple streaming blocks. So, try to lock only if there

--
With Regards,
Amit Kapila.
