RE: Exit walsender before confirming remote flush in logical replication

From: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
To: 'Alexander Lakhin' <exclusion(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Andrey Silitskiy <a(dot)silitskiy(at)postgrespro(dot)ru>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Greg Sabino Mullane <htamfids(at)gmail(dot)com>, Japin Li <japinli(at)hotmail(dot)com>, Ronan Dunklau <ronan(at)dunklau(dot)fr>, Vitaly Davydov <v(dot)davydov(at)postgrespro(dot)ru>, "Takamichi Osumi (Fujitsu)" <osumi(dot)takamichi(at)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, "sawada(dot)mshk(at)gmail(dot)com" <sawada(dot)mshk(at)gmail(dot)com>, "michael(at)paquier(dot)xyz" <michael(at)paquier(dot)xyz>, "peter(dot)eisentraut(at)enterprisedb(dot)com" <peter(dot)eisentraut(at)enterprisedb(dot)com>, "dilipbalaut(at)gmail(dot)com" <dilipbalaut(at)gmail(dot)com>, "amit(dot)kapila16(at)gmail(dot)com" <amit(dot)kapila16(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>
Subject: RE: Exit walsender before confirming remote flush in logical replication
Date: 2026-06-01 03:57:27
Message-ID: OS9PR01MB1214984918640A7AC5BEE2A0AF5152@OS9PR01MB12149.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

Dear Alexander, Fujii-san,

> 038_walsnd_shutdown_timeout_subscriber.log doesn't really contain the
> expected warning:
> 2026-05-29 21:11:58.777 CEST [2273817][logical replication apply worker][124/2:0] LOG:  logical replication apply worker for subscription "test_sub" has started
> 2026-05-29 21:12:03.232 CEST [2271870][client backend][4/6:0] LOG:  statement: BEGIN;
> 2026-05-29 21:12:03.232 CEST [2271870][client backend][4/6:0] LOG:  statement: LOCK TABLE test_tab IN EXCLUSIVE MODE;
...

To confirm: IIUC, the warning should appear in the publisher log, not on the
subscriber side. And the log below appeared on the publisher:

```
2026-05-29 21:12:03.426 CEST [2275591][walsender][26/1:0] FATAL: canceling authentication due to timeout
2026-05-29 21:12:03.432 CEST [2273580][checkpointer][:0] LOG: shutting down
```

Is it possible that the walsender was shut down during authentication,
specifically between BackendInitialize() and the end of PerformAuthentication()?

> I think this can be explained by the fact that walrcv->ready_to_display
> is set before WalReceiverMain's loop reached. I've reproduced this test
> failure with:

I verified that this reproduces the failure, but there was no "canceling
authentication due to timeout" message in the publisher log in my environment.
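
In case it helps others compare their publisher logs, here is a minimal sketch
of such a check in Python. It assumes the log lines have the same prefix shape
as the excerpts quoted above (timestamp, time zone, then three bracketed fields
before the severity); the names LINE_RE and find_auth_timeouts are only
illustrative and are not part of the test or the patch.

```python
import re

# Assumed line shape, based only on the excerpts quoted in this thread:
# "<date> <time> <tz> [pid][backend type][vxid:xid] SEVERITY: message"
LINE_RE = re.compile(
    r"^(?P<ts>\S+ \S+) \S+ "
    r"\[(?P<pid>\d+)\]\[(?P<backend>[^\]]*)\]\[[^\]]*\] "
    r"(?P<severity>[A-Z]+):\s+(?P<msg>.*)$"
)

AUTH_TIMEOUT = "canceling authentication due to timeout"


def find_auth_timeouts(lines):
    """Return (timestamp, pid) for each walsender FATAL auth-timeout line."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line.rstrip("\n"))
        if m is None:
            # Skip lines that do not follow the assumed prefix shape.
            continue
        if (
            m["backend"] == "walsender"
            and m["severity"] == "FATAL"
            and AUTH_TIMEOUT in m["msg"]
        ):
            hits.append((m["ts"], int(m["pid"])))
    return hits
```

Passing the lines of the publisher log (for example, an open file object) to
find_auth_timeouts() returns an empty list when no walsender hit the
authentication timeout, which is what I see in my environment.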

Best regards,
Hayato Kuroda
FUJITSU LIMITED
