Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

From: hubert depesz lubaczewski <depesz(at)depesz(dot)com>
To: Thom Brown <thom(at)linux(dot)com>
Cc: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>, PostgreSQL General <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug
Date: 2025-08-21 10:03:20
Message-ID: aKbu6NaV5M0ifPn4@depesz.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Wed, Aug 20, 2025 at 06:30:00PM +0100, Thom Brown wrote:
> Do you have THP enabled? Can you use mpstat and see what %steal shows as?

Sorry, what is "THP"? I tried searching for "what is thp", and most
common search results are related to some chemical compound.

mpstat, yes, we have it. Let me quickly show what I get:

Two days ago, at ~ 10:10pm UTC I saw this in Pg logs:

# grep -oP '^2025-08-19 22:09:2\d\.\d+ UTC' postgresql-2025-08-19_220000.csv | uniq -c | grep -C3 -P '^\s*\d\d'
2 2025-08-19 22:09:29.084 UTC
1 2025-08-19 22:09:29.094 UTC
2 2025-08-19 22:09:29.097 UTC
70 2025-08-19 22:09:29.109 UTC
90 2025-08-19 22:09:29.110 UTC
6 2025-08-19 22:09:29.111 UTC
1 2025-08-19 22:09:29.153 UTC
1 2025-08-19 22:09:29.555 UTC

As you can see we have 70, and then 90 messages all logged with the same
timestamp. All of them (160) were "duration:x" lines.

At the time, mpstat (called using `mpstat 5`) looked like this:

CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
22:09:04 all 2.25 0.00 0.27 0.24 0.00 0.10 0.00 0.00 0.00 97.14
22:09:09 all 2.70 0.00 0.28 0.27 0.00 0.12 0.00 0.00 0.00 96.63
22:09:14 all 2.83 0.00 0.63 0.23 0.00 0.09 0.00 0.00 0.00 96.22
22:09:19 all 2.66 0.00 0.28 0.23 0.00 0.10 0.00 0.00 0.00 96.74
22:09:24 all 2.36 0.00 0.27 0.21 0.00 0.10 0.00 0.00 0.00 97.06
22:09:29 all 2.49 0.00 0.40 0.40 0.00 0.11 0.00 0.00 0.00 96.60
22:09:34 all 2.65 0.00 0.28 0.23 0.00 0.11 0.00 0.00 0.00 96.73
22:09:39 all 2.26 0.00 0.26 0.21 0.00 0.09 0.00 0.00 0.00 97.19
22:09:44 all 2.31 0.00 0.25 0.22 0.00 0.09 0.00 0.00 0.00 97.13
22:09:49 all 2.44 0.00 0.22 0.24 0.00 0.08 0.00 0.00 0.00 97.02
22:09:54 all 2.23 0.00 0.23 0.28 0.00 0.10 0.00 0.00 0.00 97.16
22:09:59 all 2.48 0.00 0.26 0.30 0.00 0.11 0.00 0.00 0.00 96.85
22:10:04 all 2.29 0.00 0.28 0.26 0.00 0.12 0.00 0.00 0.00 97.06
22:10:09 all 2.31 0.00 0.22 0.21 0.00 0.11 0.00 0.00 0.00 97.14
22:10:14 all 2.38 0.00 0.25 0.21 0.00 0.10 0.00 0.00 0.00 97.05
22:10:19 all 2.54 0.00 0.24 0.20 0.00 0.11 0.00 0.00 0.00 96.91
22:10:24 all 2.26 0.00 0.25 0.22 0.00 0.10 0.00 0.00 0.00 97.17
22:10:29 all 2.12 0.00 0.27 0.24 0.00 0.09 0.00 0.00 0.00 97.27
22:10:34 all 2.56 0.00 0.26 0.26 0.00 0.12 0.00 0.00 0.00 96.80
22:10:39 all 2.16 0.00 0.23 0.24 0.00 0.10 0.00 0.00 0.00 97.27
22:10:44 all 2.37 0.00 0.26 0.22 0.00 0.10 0.00 0.00 0.00 97.05
22:10:49 all 2.25 0.00 0.25 0.61 0.00 0.09 0.00 0.00 0.00 96.80
22:10:54 all 2.41 0.00 0.28 0.22 0.00 0.10 0.00 0.00 0.00 96.99
22:10:59 all 2.83 0.00 0.29 0.19 0.00 0.12 0.00 0.00 0.00 96.57

Best regards,

depesz

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message hubert depesz lubaczewski 2025-08-21 10:07:06 Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug
Previous Message hubert depesz lubaczewski 2025-08-21 09:50:53 Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug