Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: depesz(at)depesz(dot)com
Cc: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>, PostgreSQL General <pgsql-general(at)lists(dot)postgresql(dot)org>, Chris Wilson <chris+google(at)qwirx(dot)com>
Subject: Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug
Date: 2025-08-22 15:55:18
Message-ID: 1885947.1755878118@sss.pgh.pa.us
Lists: pgsql-general

hubert depesz lubaczewski <depesz(at)depesz(dot)com> writes:
> On Fri, Aug 22, 2025 at 11:21:22AM -0400, Tom Lane wrote:
>> Interesting. That futex call is presumably caused by interaction
>> with some other process within the standby server, and the only
>> plausible candidate really is the startup process (which is replaying
>> WAL received from the primary). There are cases where WAL replay
>> will take locks that can block queries on the standby. Can you
>> correlate the delays on the standby server with any DDL events
>> occurring on the primary?

> Nope. Plus there is certain repetition of these cases, so even if I'd
> miss *some* create table/alter, it just isn't going to be happening
> every 4-5 minutes.

Nonetheless, I'm suspecting an interaction with the startup process,
because there just isn't that much else that this process could be
needing to deal with. Can you try strace'ing both the process doing
the test query and the startup process, to see what the startup
process is doing at the times the futex calls happen?
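
One way to line up the two traces afterwards, as a rough sketch only: it assumes each process was traced separately with something like `strace -tt -T -p PID` (so every line starts with a microsecond wall-clock timestamp and ends with the time spent in the call), and that both traces fall within the same day. The function names and the 0.5 second threshold are made up for illustration and are not part of any PostgreSQL or strace tooling.

```python
import re
from datetime import datetime, timedelta

# Matches completed-syscall lines as written by `strace -tt -T`, e.g.
#   15:55:18.123456 futex(0x7f12, FUTEX_WAIT, 0, NULL) = 0 <1.002345>
# An optional leading pid (as added by -f or -o with several pids) is tolerated.
LINE_RE = re.compile(
    r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?"
    r"(?P<ts>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<call>\w+)\(.*<(?P<dur>\d+\.\d+)>\s*$"
)


def parse_strace(lines):
    """Yield (start, end, syscall_name, raw_line) for each completed syscall."""
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # signals, "<unfinished ...>" fragments, exit lines, etc.
        # -tt gives time of day only; midnight rollover is not handled here.
        start = datetime.strptime(m.group("ts"), "%H:%M:%S.%f")
        end = start + timedelta(seconds=float(m.group("dur")))
        yield start, end, m.group("call"), line.rstrip("\n")


def slow_futex_windows(backend_lines, min_seconds=0.5):
    """Return (start, end, raw_line) for futex calls lasting >= min_seconds."""
    return [
        (start, end, raw)
        for start, end, call, raw in parse_strace(backend_lines)
        if call == "futex" and (end - start).total_seconds() >= min_seconds
    ]


def startup_activity(startup_lines, window_start, window_end):
    """Return startup-process syscalls whose duration overlaps the window."""
    return [
        raw
        for start, end, _call, raw in parse_strace(startup_lines)
        if start <= window_end and end >= window_start
    ]
```

Feeding the backend's trace to `slow_futex_windows()` and then passing each resulting window, together with the startup process's trace, to `startup_activity()` would list what the startup process was in the middle of while the query was stuck in futex.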

regards, tom lane
