Checkpointer sync queue fills up / loops around pg_usleep() are bad

From:	Andres Freund <andres(at)anarazel(dot)de>
To:	pgsql-hackers(at)postgresql(dot)org, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject:	Checkpointer sync queue fills up / loops around pg_usleep() are bad
Date:	2022-02-26 21:39:42
Message-ID:	20220226213942.nb7uvb2pamyu26dj@alap3.anarazel.de
Lists:	pgsql-hackers

Hi,

In two recent investigations into occasional test failures
(019_replslot_limit.pl failures, AIO rebase), the problems were somehow tied
to the checkpointer.

I don't yet know whether this is causally related to precisely those failures,
but when running e.g. 027_stream_regress.pl, I see phases in which many
backends are looping in RegisterSyncRequest() repeatedly, each time sleeping
with pg_usleep(10000L).

Without adding instrumentation this is completely invisible at any log
level. There are no log messages, no wait events, nothing.

ISTM, we should not have any loops around pg_usleep(). And shorter term, we
shouldn't have any loops around pg_usleep() that don't emit log messages / set
wait events. Therefore I propose that we "prohibit" such loops without at
least a DEBUG2 elog() or so. It's just too hard to debug.
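
To illustrate the kind of loop meant here, a minimal standalone sketch of a sleep-and-retry loop that at least reports each retry. This is not PostgreSQL code: retry_with_logging() and succeed_after_countdown() are hypothetical names, and fprintf() to stderr stands in for a DEBUG2 elog().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical stand-in for a request that fails while a queue is full. */
typedef bool (*try_fn) (void *arg);

/*
 * Retry 'attempt' until it succeeds, sleeping between tries. Every retry
 * emits a message, so the loop is never silent. Returns the number of
 * retries needed, or -1 if max_retries was exhausted.
 */
static int
retry_with_logging(try_fn attempt, void *arg, int max_retries, long sleep_us)
{
    int retries = 0;

    assert(attempt != NULL);

    while (!attempt(arg))
    {
        if (retries >= max_retries)
            return -1;
        retries++;

        /* Stand-in for a debug-level log message. */
        fprintf(stderr, "DEBUG: request queue full, retry %d after %ld us\n",
                retries, sleep_us);

        struct timespec ts = {
            .tv_sec = sleep_us / 1000000L,
            .tv_nsec = (sleep_us % 1000000L) * 1000L
        };
        nanosleep(&ts, NULL);
    }
    return retries;
}

/* Example attempt: fails until the countdown reaches zero. */
static bool
succeed_after_countdown(void *arg)
{
    int *remaining = arg;

    if (*remaining == 0)
        return true;
    (*remaining)--;
    return false;
}
```

Calling retry_with_logging(succeed_after_countdown, &n, 10, 10000) with n = 3 prints three retry messages before succeeding, which is exactly the information that is missing today.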

The reason for the sync queue filling up in 027_stream_regress.pl is actually
fairly simple:

1) The test runs with shared_buffers = 1MB, leading to a small sync queue of
128 entries.
2) CheckpointWriteDelay() does pg_usleep(100000L)

ForwardSyncRequest() wakes up the checkpointer using SetLatch() if the sync
queue is more than half full.
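
The numbers can be checked with a little arithmetic. A minimal sketch, assuming the default 8kB block size and one sync queue slot per shared buffer (both are assumptions, not stated above). The helper names are hypothetical, and the "more than half" test mirrors the wording above rather than the exact comparison in the source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Assumption: the sync queue has one slot per shared buffer, so its size is
 * shared_buffers divided by the block size.
 */
static size_t
sync_queue_capacity(size_t shared_buffers_bytes, size_t block_size)
{
    assert(block_size > 0);
    return shared_buffers_bytes / block_size;
}

/*
 * Wake the checkpointer once the queue is more than half full, following
 * the description in the text.
 */
static bool
should_wake_checkpointer(size_t queued, size_t capacity)
{
    return queued > capacity / 2;
}
```

With shared_buffers = 1MB and 8kB blocks, sync_queue_capacity(1024 * 1024, 8192) gives 128, so under this sketch the wake-up is requested from the 65th queued entry onwards.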

But at least on Linux and FreeBSD that doesn't actually interrupt pg_usleep()
anymore (due to using signalfd / kqueue rather than a signal handler). And on
all platforms the signal might arrive just before the pg_usleep() rather than
during it, in which case the sleep isn't interrupted either.
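
The second case can be reproduced outside PostgreSQL. A minimal POSIX sketch, assuming a pipe plus poll() as a stand-in for a latch (this is not the actual latch implementation, and all function names are hypothetical): a wake-up that happens before a plain sleep is simply lost, while a wake-up recorded in the pipe makes the subsequent wait return immediately.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <poll.h>
#include <time.h>
#include <unistd.h>

/* Monotonic clock in milliseconds. */
static long
now_ms(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1000L + ts.tv_nsec / 1000000L;
}

/*
 * A wake-up that "arrived" before the sleep started leaves no trace, so a
 * plain sleep always runs for its full duration. Returns elapsed ms.
 */
static long
elapsed_plain_sleep(int sleep_ms)
{
    long start = now_ms();
    struct timespec ts = {
        .tv_sec = sleep_ms / 1000,
        .tv_nsec = (sleep_ms % 1000) * 1000000L
    };

    nanosleep(&ts, NULL);
    return now_ms() - start;
}

/*
 * Latch-like wait: the wake-up is recorded as a byte in a pipe, so a wait
 * that starts afterwards still sees it and returns without sleeping.
 * Returns elapsed ms.
 */
static long
elapsed_latch_wait(int timeout_ms)
{
    int fds[2];
    int rc = pipe(fds);

    assert(rc == 0);

    /* The wake-up arrives before the wait begins. */
    char byte = 1;
    ssize_t n = write(fds[1], &byte, 1);

    assert(n == 1);

    long start = now_ms();
    struct pollfd pfd = { .fd = fds[0], .events = POLLIN };

    poll(&pfd, 1, timeout_ms);
    long elapsed = now_ms() - start;

    close(fds[0]);
    close(fds[1]);
    return elapsed;
}
```

With a 100ms timeout, matching the pg_usleep(100000L) in CheckpointWriteDelay(), the plain sleep takes the full 100ms while the pipe-based wait returns almost immediately.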

If I shorten the sleep in CheckpointWriteDelay() the problem goes away. This
actually reduces the time for a single run of 027_stream_regress.pl on my
workstation noticeably. With the default sleep time it's ~32s, with the
shortened time it's ~27s.

I suspect we need to do something about this concrete problem for 14 and
master, because it's certainly worse than before on Linux / FreeBSD.

I suspect the easiest is to just convert that usleep to a WaitLatch(). That'd
require adding a new enum value to WaitEventTimeout in 14. Which probably is
fine?

Greetings,

Andres Freund
