Re: Perform streaming logical transactions by background workers and parallel apply

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Perform streaming logical transactions by background workers and parallel apply
Date: 2022-08-18 10:20:37
Message-ID: CAA4eK1JR2GR9jjaz9T1ZxzgLVS0h089EE8ZB=F2EsVHbM_5sfA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Aug 18, 2022 at 3:40 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> On Thu, Aug 18, 2022 at 6:57 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > > 47. src/include/replication/logicalproto.h
> > >
> > > @@ -32,12 +32,17 @@
> > > *
> > > * LOGICALREP_PROTO_TWOPHASE_VERSION_NUM is the minimum protocol version with
> > > * support for two-phase commit decoding (at prepare time). Introduced in PG15.
> > > + *
> > > + * LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM is the minimum protocol version
> > > + * with support for streaming large transactions using apply background
> > > + * workers. Introduced in PG16.
> > > */
> > > #define LOGICALREP_PROTO_MIN_VERSION_NUM 1
> > > #define LOGICALREP_PROTO_VERSION_NUM 1
> > > #define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
> > > #define LOGICALREP_PROTO_TWOPHASE_VERSION_NUM 3
> > > -#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_TWOPHASE_VERSION_NUM
> > > +#define LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM 4
> > > +#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM
> > >
> > > 47a.
> > > I don't think that comment is strictly true. IIUC the new protocol
> > > version 4 is currently only affecting the *extra* STREAM_ABORT members
> > > – but in fact streaming=parallel is still functional without using
> > > those extra members, isn't it? So maybe this description needed to be
> > > modified a bit to be more accurate?
> > >
> >
> > The reason for sending these extra abort members is to ensure that
> > after aborting the transaction, if the subscriber/apply worker
> > restarts, it doesn't need to request the transaction again. Do you
> > have suggestions for improving this comment?
> >
>
> I gave three review comments for v21-0001 that were all related to
> this same point:
> i- #4b (commit message)
> ii- #7 (protocol pgdocs)
> iii- #47a (code comment)
>
> The point was:
> AFAIK protocol 4 is only to let the parallel streaming logic behave
> *better* in how it can handle restarts after aborts. But that does not
> mean that protocol 4 is a *pre-requisite* for "allowing"
> streaming=parallel to work in the first place. I thought that a PG15
> publisher and PG16 subscriber can still work using streaming=parallel
> even with protocol 3, but it just won't be quite as clever for
> handling restarts after abort as protocol 4 (PG16 -> PG16) would be.
>

It is not only that it makes things better. One could say it is a
break of the replication protocol if the client (subscriber), after
having applied some transaction, requests the same transaction again.
So, I think it is better to make the parallelism work only when the
server (publisher) version is also 16.
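
As a rough illustration of that rule (a sketch only, not code from the
patch): the subscriber could derive the protocol version from the
publisher's server version and allow parallel apply only when protocol
version 4 is available. The helper names choose_proto_version() and
parallel_apply_allowed() are made up for this example; the version
macros are the ones from the hunk quoted above, and the
140000/150000/160000 thresholds assume the usual server_version_num
encoding.

```c
#include <assert.h>
#include <stdbool.h>

/* Protocol version numbers, as in the quoted logicalproto.h hunk. */
#define LOGICALREP_PROTO_VERSION_NUM 1
#define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
#define LOGICALREP_PROTO_TWOPHASE_VERSION_NUM 3
#define LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM 4

/*
 * Hypothetical helper: pick the highest protocol version that a publisher
 * with the given server version number (e.g. 160000 for PG16) understands.
 */
int
choose_proto_version(int server_version)
{
    assert(server_version > 0);

    if (server_version >= 160000)
        return LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM;
    if (server_version >= 150000)
        return LOGICALREP_PROTO_TWOPHASE_VERSION_NUM;
    if (server_version >= 140000)
        return LOGICALREP_PROTO_STREAM_VERSION_NUM;
    return LOGICALREP_PROTO_VERSION_NUM;
}

/*
 * Hypothetical helper: use parallel apply only if the subscription asked
 * for it AND the publisher speaks protocol version 4, so that an aborted
 * streamed transaction is never requested again after a restart.
 */
bool
parallel_apply_allowed(int server_version, bool parallel_requested)
{
    return parallel_requested &&
        choose_proto_version(server_version) >=
        LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM;
}
```

With this, a PG15 publisher (150000) and a PG16 subscriber would
negotiate protocol 3, and parallel_apply_allowed() would return false
even with streaming=parallel requested. What the subscriber does in that
case is a separate decision that this sketch does not cover.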

--
With Regards,
Amit Kapila.
