Re: Checkpointer sync queue fills up / loops around pg_usleep() are bad

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Julien Rouhaud <rjuju123(at)gmail(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Checkpointer sync queue fills up / loops around pg_usleep() are bad
Date: 2022-02-28 00:19:21
Message-ID: CA+hUKG+Fug3=Hw2dAiVEMqbC-F_1tKNo_Ev5q6-5Ww7aUGKDWw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Feb 27, 2022 at 10:29 PM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
> On Sun, Feb 27, 2022 at 06:10:45PM +0900, Michael Paquier wrote:
> > On Sat, Feb 26, 2022 at 01:39:42PM -0800, Andres Freund wrote:
> > > I suspect the easiest is to just convert that usleep to a WaitLatch(). That'd
> > > require adding a new enum value to WaitEventTimeout in 14. Which probably is
> > > fine?
> >
> > We've added wait events in back-branches in the past, so this does not
> > strike me as a problem as long as you add the new entry at the end of
> > the enum, while keeping things ordered on HEAD.
>
> +1

+1

Sleeps like these are also really bad for ProcSignalBarrier, which I
was expecting to be the impetus for fixing any remaining loops like
this.

With the attached, 027_stream_regress.pl drops from ~29.5s to ~19.6s
on my FreeBSD workstation!

It seems a little strange to introduce a new wait event that will very
often appear into a stable branch, but ... it is actually telling the
truth, so there is that.

The sleep/poll loop in RegisterSyncRequest() may also have another
problem. The comment explains that it was a deliberate choice not to
do CHECK_FOR_INTERRUPTS() here, which may be debatable, but I don't
think there's an excuse to ignore postmaster death in a loop that
presumably becomes infinite if the checkpointer exits. I guess we
could do:

- pg_usleep(10000L);
+ WaitLatch(NULL, WL_EXIT_ON_PM_DEATH | WL_TIMEOUT, 10,
WAIT_EVENT_SYNC_REQUEST);

But... really, this should be waiting on a condition variable that the
checkpointer broadcasts on when the queue goes from full to not full,
no? Perhaps for master only?

Attachment Content-Type Size
0001-Wake-up-for-latches-in-CheckpointWriteDelay.patch text/x-patch 3.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Japin Li 2022-02-28 01:04:50 Re: CREATE DATABASE IF NOT EXISTS in PostgreSQL
Previous Message Thomas Munro 2022-02-28 00:00:26 Re: Missed condition-variable wakeups on FreeBSD