Re: Perform streaming logical transactions by background workers and parallel apply

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Perform streaming logical transactions by background workers and parallel apply
Date: 2022-08-23 04:01:52
Message-ID: CAHut+PsDcJ6r4a=AvpfDN08tstU2Qwuz+=-h_uECufBmtFbHww@mail.gmail.com
Lists: pgsql-hackers

On Mon, Aug 22, 2022 at 7:01 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Mon, Aug 22, 2022 at 4:42 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> >
> > On Fri, Aug 19, 2022 at 7:55 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > On Fri, Aug 19, 2022 at 3:05 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> > > >
> > > > On Fri, Aug 19, 2022 at 7:10 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > > >
> > > > > On Fri, Aug 19, 2022 at 2:36 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> > > > > >
> > > > > > Here are my review comments for the v23-0005 patch:
> > > > > >
> > > > > > ======
> > > > > >
> > > > > > Commit Message says:
> > > > > > main_worker_pid is Process ID of the main apply worker, if this process is a
> > > > > > apply background worker. NULL if this process is a main apply worker or a
> > > > > > synchronization worker.
> > > > > > The new column can make it easier to distinguish main apply worker and apply
> > > > > > background worker.
> > > > > >
> > > > > > --
> > > > > >
> > > > > > Having a column called ‘main_worker_pid’ which is defined to be NULL
> > > > > > if the process *is* the main apply worker does not make any sense to
> > > > > > me.
> > > > > >
> > > > >
> > > > > I haven't read this part of a patch but it seems to me we have
> > > > > something similar for parallel query workers. Refer 'leader_pid'
> > > > > column in pg_stat_activity.
> > > > >
> > > >
> > > > IIUC (from the patch 0005 commit message) the intention is to be able
> > > > to easily distinguish the worker types.
> > > >
> > >
> > > I think it is only to distinguish between leader apply worker and
> > > background apply workers. The tablesync worker can be distinguished
> > > based on relid field.
> > >
> >
> > Right. But that's the reason for my question in the first place - why
> > implement the patch so that the user still has to jump through hoops
> > just to know the worker type information?
> >
>
> I think it is not only to judge worker type but also to know the pid
> of each of the workers during parallel apply. Isn't it better to have
> both main apply worker pid and parallel apply worker pid as we have
> for the parallel query system?
>

OK, thanks for pointing me to that other view. Now that I see the
existing pg_stat_activity already has 'pid' and 'leader_pid' [1], it
suddenly seems more reasonable to do something similar for
pg_stat_subscription.

This background information needs to be conveyed better in the patch
0005 commit message. The current commit message says nothing about
trying to be consistent with the existing stats views; it only says
this field was added to distinguish more easily between the types of
apply workers.
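
For illustration only, here is a rough sketch of how a client could tell
the worker types apart using the two fields discussed in this thread. It
assumes the semantics given in the v23-0005 commit message
('main_worker_pid' is NULL unless the process is an apply background
worker) and the existing 'relid' column (non-NULL only for a tablesync
worker). The function name classify_worker is hypothetical and is not
part of the patch.

```python
from typing import Optional


def classify_worker(relid: Optional[int], main_worker_pid: Optional[int]) -> str:
    """Classify one pg_stat_subscription-style row (illustrative sketch).

    Assumptions, taken from the discussion above rather than from the code:
      - main_worker_pid is set only for an apply background worker;
      - relid is set only for a tablesync worker.
    None stands in for SQL NULL.
    """
    if main_worker_pid is not None:
        return "apply background worker"
    if relid is not None:
        return "tablesync worker"
    return "main apply worker"


# Example rows as (relid, main_worker_pid) pairs; the values are made up.
rows = [(None, None), (16384, None), (None, 4242)]
kinds = [classify_worker(relid, main_pid) for relid, main_pid in rows]
```

With this layout a row needs both columns checked to recover its type,
which is the "jumping through hoops" I was referring to earlier.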

------
[1] https://www.postgresql.org/docs/devel/monitoring-stats.html

Kind Regards,
Peter Smith.
Fujitsu Australia
