RE: Logical replication timeout problem

From: "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Fabrice Chapuis <fabrice636861(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: Logical replication timeout problem
Date: 2021-11-12 09:22:11
Message-ID: OS0PR01MB6113AAC410AA16C5C3D894EBFB959@OS0PR01MB6113.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

On Friday, November 12, 2021 2:24 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Thu, Nov 11, 2021 at 11:15 PM Fabrice Chapuis
> <fabrice636861(at)gmail(dot)com> wrote:
> >
> > Hello,
> > Our lab is ready now. Amit, I compiled Postgres 10.18 with your patch. Tang, I
> > used your script to configure logical replication between 2 databases and to
> > generate 10 million entries in an unreplicated foo table. On a standalone instance
> > no error message appears in the log.
> > I activated physical replication between 2 nodes, and I got the following error:
> >
> > 2021-11-10 10:49:12.297 CET [12126] LOG: attempt to send keep alive message
> > 2021-11-10 10:49:12.297 CET [12126] STATEMENT: START_REPLICATION 0/3000000 TIMELINE 1
> > 2021-11-10 10:49:15.127 CET [12064] FATAL: terminating logical replication worker due to administrator command
> > 2021-11-10 10:49:15.127 CET [12036] LOG: worker process: logical replication worker for subscription 16413 (PID 12064) exited with exit code 1
> > 2021-11-10 10:49:15.155 CET [12126] LOG: attempt to send keep alive message
> >
> > This message looks strange because no admin command was executed
> > during the data load.
> > I did not find any error related to the timeout.
> > The message added by the patch keeps coming back:
> > attempt to send keep alive message. But there is no "sent keep alive
> > message".
> >
> > Why does the logical replication worker exit when physical replication is configured?
> >
>
> I am also not sure why that happened; it may be due to
> max_worker_processes reaching its limit. This can happen because it
> seems you configured both publisher and subscriber in the same
> cluster. Tang, did you also see the same problem?
>

No.
I used the default max_worker_processes value and ran logical replication and
physical replication at the same time. I also changed data in the table on the
publisher, but I didn't see the same problem.

Regards
Tang
