From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Perform streaming logical transactions by background workers and parallel apply
Date: 2022-10-06 08:06:30
Message-ID: CAD21AoBLPDPCE4jj16ZRE9J8AoxLtvo3yasToaP5kRp6++=yyg@mail.gmail.com
Lists: pgsql-hackers

On Tue, Sep 27, 2022 at 9:26 PM houzj(dot)fnst(at)fujitsu(dot)com
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> On Saturday, September 24, 2022 7:40 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Thu, Sep 22, 2022 at 3:41 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
> > wrote:
> > >
> > > On Thu, Sep 22, 2022 at 8:59 AM wangw(dot)fnst(at)fujitsu(dot)com
> > > <wangw(dot)fnst(at)fujitsu(dot)com> wrote:
> > > >
> > >
> > > Few comments on v33-0001
> > > =======================
> > >
> >
> > Some more comments on v33-0001
> > =============================
> > 1.
> > + /* Information from the corresponding LogicalRepWorker slot. */
> > + uint16 logicalrep_worker_generation;
> > +
> > + int logicalrep_worker_slot_no;
> > +} ParallelApplyWorkerShared;
> >
> > Both these variables are read/changed by leader/parallel workers without
> > using any lock (mutex). It seems currently there is no problem because of the
> > way the patch is using in_parallel_apply_xact but I think it won't be a good idea
> > to rely on it. I suggest using mutex to operate on these variables and also check
> > if the slot_no is in a valid range after reading it in parallel_apply_free_worker,
> > otherwise error out using elog.
>
> Changed.
>
> > 2.
> > static void
> > apply_handle_stream_stop(StringInfo s)
> > {
> > - if (!in_streamed_transaction)
> > + ParallelApplyWorkerInfo *winfo = NULL; TransApplyAction apply_action;
> > +
> > + if (!am_parallel_apply_worker() &&
> > + (!in_streamed_transaction && !stream_apply_worker))
> > ereport(ERROR,
> > (errcode(ERRCODE_PROTOCOL_VIOLATION),
> > errmsg_internal("STREAM STOP message without STREAM START")));
> >
> > This check won't be able to detect missing stream start messages for parallel
> > apply workers apart from the first pair of start/stop. I thought of adding
> > in_remote_transaction check along with
> > am_parallel_apply_worker() to detect the same but that also won't work
> > because the parallel worker doesn't reset it at the stop message.
> > Another possibility is to introduce yet another variable for this but that doesn't
> > seem worth it. I would like to keep this check simple.
> > Can you think of any better way?
>
> I feel we can reuse in_streamed_transaction in the parallel apply worker to
> simplify the check there. I tried setting this flag in the parallel apply worker
> when the stream starts and resetting it at stream stop, so that we can directly
> check this flag for duplicate stream start messages and other related things.
>
> > 3. I think we can skip sending start/stop messages from the leader to the
> > parallel worker because unlike apply worker it will process only one
> > transaction-at-a-time. However, it is not clear whether that is worth the effort
> > because it is sent after logical_decoding_work_mem changes. For now, I have
> > added a comment for this in the attached patch but let me know if I am missing
> > something or if I am wrong.
>
> I think the suggested comments look good.
>
> > 4.
> > postgres=# select pid, leader_pid, application_name, backend_type from
> > pg_stat_activity;
> > pid | leader_pid | application_name | backend_type
> > -------+------------+------------------+------------------------------
> > 27624 | | | logical replication launcher
> > 17336 | | psql | client backend
> > 26312 | | | logical replication worker
> > 26376 | | psql | client backend
> > 14004 | | | logical replication worker
> >
> > Here, the second worker entry is for the parallel worker. Isn't it better if we
> > distinguish this by keeping type as a logical replication parallel worker? I think
> > for this you need to change bgw_type in logicalrep_worker_launch().
>
> Changed.
>
> > 5. Can we name parallel_apply_subxact_info_add() as
> > parallel_apply_start_subtrans()?
> >
> > Apart from the above, I have added/edited a few comments and made a few
> > other cosmetic changes in the attached.
>

While looking at the v35 patch, I realized that there are some cases
where logical replication gets stuck, depending on the partitioned
table structure. For instance, consider the following tables,
publication, and subscription:

* On publisher
create table p (c int) partition by list (c);
create table c1 partition of p for values in (1);
create table c2 (c int);
create publication test_pub for table p, c1, c2 with
(publish_via_partition_root = 'true');

* On subscriber
create table p (c int) partition by list (c);
create table c1 partition of p for values in (2);
create table c2 partition of p for values in (1);
create subscription test_sub connection 'port=5551 dbname=postgres'
publication test_pub with (streaming = 'parallel', copy_data =
'false');
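To double-check that the layouts really differ, the partition tree on
each node can be inspected (pg_partition_tree() is available since
PostgreSQL 12):

-- On each node, list the partitions of p and their parents
SELECT relid, parentrelid, isleaf FROM pg_partition_tree('p');

On the publisher only c1 is a leaf partition of p, while on the
subscriber both c1 and c2 are, with swapped value lists.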

Note that while both the publisher and the subscriber have tables with
the same names, the partition structure is different, so rows are
routed to a different table on the subscriber (e.g., a row with c=1
goes to table c2 on the subscriber). If two concurrent transactions
are executed as follows, the apply worker (i.e., the leader apply
worker) waits for a lock on c2 that is held by its parallel apply
worker:

* TX-1
BEGIN;
INSERT INTO p SELECT 1 FROM generate_series(1, 10000); --- changes are streamed

* TX-2
BEGIN;
TRUNCATE c2; --- wait for a lock on c2

* TX-1
INSERT INTO p SELECT 1 FROM generate_series(1, 10000);
COMMIT;
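When the hang occurs, the blocked/blocking pair can be observed with a
query along these lines (a sketch; the exact rows depend on the
system):

-- Show who holds and who waits for locks on c2
SELECT l.pid, a.backend_type, l.mode, l.granted
FROM pg_locks l
JOIN pg_stat_activity a ON a.pid = l.pid
WHERE l.relation = 'c2'::regclass;

The leader apply worker shows granted = false on its requested lock,
while the parallel apply worker holds the conflicting one.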

This might not be a common case in practice, but it means there is a
restriction on how partitioned tables can be structured on the
publisher and the subscriber when using streaming = 'parallel'. When
this happens, logical replication cannot move forward, so users need
to disable parallel-apply mode or increase logical_decoding_work_mem.
We could describe this limitation in the docs, but it would be hard
for users to detect such problematic table structures.
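For completeness, the immediate workarounds would look like this (the
memory setting goes on the publisher, since it controls when the
walsender starts streaming; '256MB' is just an illustrative value):

-- On the subscriber: fall back to serialized streaming
ALTER SUBSCRIPTION test_sub SET (streaming = on);

-- Or, on the publisher: raise the streaming threshold
ALTER SYSTEM SET logical_decoding_work_mem = '256MB';
SELECT pg_reload_conf();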

BTW, when the leader apply worker waits for a lock on c2 in the above
example, the parallel apply worker is in a busy-loop, which should be
fixed.

Regards,

--
Masahiko Sawada
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
