Re: Column Filtering in Logical Replication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Petr Jelinek <petr(dot)jelinek(at)enterprisedb(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>
Subject: Re: Column Filtering in Logical Replication
Date: 2022-04-18 11:04:06
Message-ID: CAA4eK1+fbCPjoFUO-FYDkosfzs0he4=AJtJNcQkOxV53=39jJg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Apr 14, 2022 at 9:09 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Thu, Apr 14, 2022 at 8:32 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> > > The other part of the puzzle is the below check in the code:
> > > /*
> > > * If we reached the sync worker limit per subscription, just exit
> > > * silently as we might get here because of an otherwise harmless race
> > > * condition.
> > > */
> > > if (nsyncworkers >= max_sync_workers_per_subscription)
> > >
> > > It is not clear to me why this check is there, if this wouldn't be
> > > there, the user would have got either a WARNING to increase the
> > > max_logical_replication_workers or the apply worker would have been
> > > restarted. Do you have any idea about this?
> >
> > Yeah, I'm also puzzled with this check. It seems that this function
> > doesn't work well when the apply worker is not running and some
> > tablesync workers are running. I initially thought that the apply
> > worker calls to this function as many as tables that needs to be
> > synced, but it checks the max_sync_workers_per_subscription limit
> > before calling to logicalrep_worker_launch(). So I'm not really sure
> > we need this check.
> >
>
> I just hope that the original author Petr J. responds to this point. I
> have added him to this email. This will help us to find the best
> solution for this problem.
>

I did some more investigation for this code. It is added by commit [1]
and the patch that led to this commit is first time posted on -hackers
in email [2]. Now, neither the commit message nor the patch (comments)
gives much idea as to why this part of code is added but I think there
is some hint in the email [2]. In particular, read the paragraph in
the email [2] that has the lines: ".... and limiting sync workers per
subscription theoretically wasn't either (although I don't think it
could happen in practice).".

It seems that this check has been added to theoretically limit the
sync workers even though that can't happen because apply worker
ensures that before trying to launch the sync worker. Does this theory
make sense to me? If so, I think we can change the check as: "if
(OidIsValid(relid) && nsyncworkers >=
max_sync_workers_per_subscription)" in launcher.c. This will serve the
purpose of the original code and will solve the issue being discussed
here. I think we can even backpatch this. What do you think?

[1]
commit de4389712206d2686e09ad8d6dd112dc4b6c6d42
Author: Peter Eisentraut <peter_e(at)gmx(dot)net>
Date: Wed Apr 26 10:43:04 2017 -0400

Fix various concurrency issues in logical replication worker launching

[2] - https://www.postgresql.org/message-id/fa387e24-0e26-c02d-ef16-7e46ada200dd%402ndquadrant.com

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jesper Pedersen 2022-04-18 11:10:46 Re: GSoC: pgagroal: SCRAM-SHA-256-PLUS support (2022)
Previous Message Thomas Munro 2022-04-18 10:45:07 Re: Crash in new pgstats code