From: | Michael Paquier <michael(at)paquier(dot)xyz> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | "Callahan, Drew" <callaan(at)amazon(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org> |
Subject: | Re: BUG #18155: Logical Apply Worker Timeout During TableSync Causes Either Stuckness or Data Loss |
Date: | 2023-10-18 01:22:14 |
Message-ID: | ZS8zRgKfB7AcxJWv@paquier.xyz |
Lists: | pgsql-bugs |
On Tue, Oct 17, 2023 at 09:50:52AM +0530, Amit Kapila wrote:
> On Tue, Oct 17, 2023 at 4:46 AM Callahan, Drew <callaan(at)amazon(dot)com> wrote:
>> On the server side, we did not see evidence of WALSenders being launched. As a result, the gap kept increasing further
>> and further since the workers would not transition to the catchup state after several hours due to this.
>
> One possibility is that the system has reached
> 'max_logical_replication_workers' limit due to which it is not
> allowing to launch the apply worker. If so, then consider increasing
> the value of 'max_logical_replication_workers'. You can query
> 'pg_stat_subscription' to know more information about workers. See the
> description of subscriber-side parameters [1].
Hmm. So you basically mean that not being able to launch new workers
prevents the existing workers from moving on with their individual sync
and freeing slots for other tables once their sync is done. Then, this
causes all of the existing workers to remain in a syncwait state,
further increasing the gap in WAL replay. Am I getting that right?
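
For anyone wanting to check the worker-limit theory on the subscriber, here is a minimal sketch in Python. It is not taken from this thread: the helper name summarize_workers and the two constants are hypothetical, and actually running the queries needs a database driver of your choice, which is left out here. Only 'pg_stat_subscription', its subname, pid and relid columns, and 'max_logical_replication_workers' are PostgreSQL identifiers.

```python
# Hypothetical helper for illustration; only the SQL identifiers are PostgreSQL's.

# Rows with a non-NULL pid correspond to running workers. relid is NULL for
# the main apply worker and set for a table synchronization worker.
ACTIVE_WORKERS_SQL = """
SELECT subname, pid, relid
FROM pg_stat_subscription
WHERE pid IS NOT NULL
"""

# Returns the configured limit as a single text value.
WORKER_LIMIT_SQL = "SHOW max_logical_replication_workers"


def summarize_workers(rows, max_workers):
    """Summarize worker usage from (subname, pid, relid) rows.

    Any running worker with a NULL relid is counted under "apply" here,
    which is a simplification.
    """
    running = [r for r in rows if r[1] is not None]
    tablesync = sum(1 for r in running if r[2] is not None)
    return {
        "apply": len(running) - tablesync,
        "tablesync": tablesync,
        "free_slots": max(max_workers - len(running), 0),
        "exhausted": len(running) >= max_workers,
    }
```

Feeding it the rows returned by ACTIVE_WORKERS_SQL and the integer value from WORKER_LIMIT_SQL tells you whether the pool is full. If "exhausted" is true while tables are still waiting to sync, that would be consistent with the scenario described above, though it does not prove it.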
--
Michael