Re: walsender bug: stuck during shutdown

From: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Chloe Dives <Chloe(dot)Dives(at)cantabcapital(dot)com>, Chris Wilson <chris(dot)wilson(at)cantabcapital(dot)com>
Subject: Re: walsender bug: stuck during shutdown
Date: 2020-12-04 18:27:07
Message-ID: 20201204182707.GA8461@alvherre.pgsql
Lists: pgsql-hackers

On 2020-Nov-26, Fujii Masao wrote:

> Yes, so the problem here is that walsender goes into a busy loop
> in that case. This seems to happen only in the logical replication
> walsender. In the physical replication walsender, WaitLatchOrSocket()
> in WalSndLoop() seems to work as expected and prevents it from entering
> a busy loop even in that case.
>
>     /*
>      * If postmaster asked us to stop, don't wait anymore.
>      *
>      * It's important to do this check after the recomputation of
>      * RecentFlushPtr, so we can send all remaining data before shutting
>      * down.
>      */
>     if (got_STOPPING)
>         break;
>
> The above code in WalSndWaitForWal() seems to cause this issue, but I've
> not come up with an idea for how to fix it yet.
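
To make the quoted concern concrete, here is a toy model of the pattern, not
PostgreSQL code. It assumes a hypothetical caller (toy_decode_loop) that
retries until a target position is reached, and a wait routine
(toy_wait_for_wal) that returns early, without sleeping, once a stop flag is
set. All names, the struct, and the "WAL arrives while sleeping" step are
invented for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for walsender state; NOT PostgreSQL's real definitions. */
typedef struct ToyWalSender
{
    bool     got_stopping;  /* set once shutdown has been requested */
    uint64_t flushed_upto;  /* how far "WAL" has been flushed */
    int      sleeps;        /* how many times we actually blocked */
    int      calls;         /* how many times the wait routine was entered */
} ToyWalSender;

/*
 * Loosely mirrors the shape of the quoted check: wait until `target` is
 * flushed, but stop waiting as soon as shutdown is requested.
 */
static uint64_t
toy_wait_for_wal(ToyWalSender *ws, uint64_t target)
{
    ws->calls++;
    for (;;)
    {
        if (ws->flushed_upto >= target)
            break;
        if (ws->got_stopping)
            break;                  /* returns immediately, never sleeps */
        ws->sleeps++;               /* stands in for blocking on a latch */
        ws->flushed_upto = target;  /* assumption: WAL arrived meanwhile */
    }
    return ws->flushed_upto;
}

/*
 * Hypothetical caller that keeps asking for more data until `target` is
 * reached. max_iters only bounds the demonstration; returns the number of
 * retries that made no progress.
 */
static int
toy_decode_loop(ToyWalSender *ws, uint64_t target, int max_iters)
{
    int spins = 0;

    while (spins < max_iters && toy_wait_for_wal(ws, target) < target)
        spins++;
    return spins;
}
```

With got_stopping set and no new data, every retry comes back at once and
sleeps stays at zero, which is the busy-loop shape. With the flag clear, the
same caller blocks once and finishes.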

With DEBUG1 I observe that walsender is getting a lot of 'r' messages
(standby reply) with all zeroes:

2020-12-01 21:01:24.100 -03 [15307] DEBUG: write 0/0 flush 0/0 apply 0/0
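
For reference, a minimal standalone sketch of how such a reply could be
decoded. It assumes the documented layout of the standby status update
message: the byte 'r', three 64-bit big-endian positions (write, flush,
apply), a 64-bit timestamp, and a one-byte reply-requested flag. The struct
and function names are hypothetical and are not the walsender's own code.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical holder for the decoded fields of an 'r' message. */
typedef struct StandbyReply
{
    uint64_t write_lsn;
    uint64_t flush_lsn;
    uint64_t apply_lsn;
    int64_t  send_time;        /* microseconds since 2000-01-01 */
    int      reply_requested;
} StandbyReply;

/* Read a 64-bit big-endian (network order) integer. */
static uint64_t
read_be64(const unsigned char *p)
{
    uint64_t v = 0;

    for (int i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    return v;
}

/* Returns 0 on success, -1 if buf is not a complete 'r' message. */
static int
parse_standby_reply(const unsigned char *buf, size_t len, StandbyReply *out)
{
    if (len < 34 || buf[0] != 'r')  /* 1 + 4 * 8 + 1 bytes */
        return -1;
    out->write_lsn = read_be64(buf + 1);
    out->flush_lsn = read_be64(buf + 9);
    out->apply_lsn = read_be64(buf + 17);
    out->send_time = (int64_t) read_be64(buf + 25);
    out->reply_requested = buf[33] != 0;
    return 0;
}

/* Format the positions in the same hi/lo hex style as the log line. */
static int
format_standby_reply(const StandbyReply *r, char *dst, size_t dstlen)
{
    return snprintf(dst, dstlen, "write %X/%X flush %X/%X apply %X/%X",
                    (unsigned int) (r->write_lsn >> 32),
                    (unsigned int) (r->write_lsn & 0xFFFFFFFFu),
                    (unsigned int) (r->flush_lsn >> 32),
                    (unsigned int) (r->flush_lsn & 0xFFFFFFFFu),
                    (unsigned int) (r->apply_lsn >> 32),
                    (unsigned int) (r->apply_lsn & 0xFFFFFFFFu));
}
```

A payload whose three position fields are all zero bytes formats as
"write 0/0 flush 0/0 apply 0/0", matching the message shown above.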

However, while doing that I also observed that if I send some
activity to the logical replication stream with the provided program,
the replies *still* have the 'write' pointer set to 0/0, while the
'flush' pointer moves forward to what was sent. I'm not clear on what
causes the write pointer to move forward in logical replication.

Still, the previously proposed patch does resolve the problem in either
case.
