Re: Checkpointer sync queue fills up / loops around pg_usleep() are bad

From:	Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To:	Andres Freund <andres(at)anarazel(dot)de>
Cc:	PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Checkpointer sync queue fills up / loops around pg_usleep() are bad
Date:	2022-03-01 17:46:23
Message-ID:	CA+hUKGLRtjhGWB-dd_B8z6agJaFmfxVTiSyqnka937ss3+VywQ@mail.gmail.com
Lists:	pgsql-hackers

On Mon, Feb 28, 2022 at 2:36 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> On February 27, 2022 4:19:21 PM PST, Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> >It seems a little strange to introduce a new wait event that will very
> >often appear into a stable branch, but ... it is actually telling the
> >truth, so there is that.
>
> In the back branches it needs to be at the end of the enum - I assume you intended that just to be for HEAD.

Yeah.

> I wonder whether in HEAD we shouldn't make that sleep duration be computed from the calculation in IsOnSchedule...

I might look into this.

> >The sleep/poll loop in RegisterSyncRequest() may also have another
> >problem. The comment explains that it was a deliberate choice not to
> >do CHECK_FOR_INTERRUPTS() here, which may be debatable, but I don't
> >think there's an excuse to ignore postmaster death in a loop that
> >presumably becomes infinite if the checkpointer exits. I guess we
> >could do:
> >
> >- pg_usleep(10000L);
> >+ WaitLatch(NULL, WL_EXIT_ON_PM_DEATH | WL_TIMEOUT, 10, WAIT_EVENT_SYNC_REQUEST);
> >
> >But... really, this should be waiting on a condition variable that the
> >checkpointer broadcasts on when the queue goes from full to not full,
> >no? Perhaps for master only?
>
> Looks worth improving, but yes, I'd not do it in the back branches.

0003 is a first attempt at that, for master only (on top of 0002 which
is the minimal fix). This shaves another second off
027_stream_regress.pl on my workstation. The main thing I realised is
that I needed to hold interrupts while waiting, which seems like it
should go away with 'tombstone' files as discussed in other threads.
That's not a new problem in this patch, it just looks more offensive
to the eye when you spell it out, instead of hiding it with an
unreported sleep/poll loop...
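
For readers following along, here is a rough sketch of the general idea of waiting on a condition variable that is broadcast when a bounded queue goes from full to not full, instead of sleeping and polling. It is not the 0003 patch and it does not use PostgreSQL's ConditionVariable API: it is a plain pthreads analogue, and every name in it (SyncQueue, register_request, absorb_requests, run_demo, QUEUE_CAPACITY) is hypothetical. It also does not model holding interrupts or postmaster death.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>

#define QUEUE_CAPACITY 4	/* hypothetical; tiny so the producer blocks often */

/* Hypothetical stand-in for the shared sync request queue. */
typedef struct SyncQueue
{
	pthread_mutex_t lock;
	pthread_cond_t not_full;	/* broadcast on the full -> not full transition */
	int			count;			/* requests currently queued */
} SyncQueue;

static void
queue_init(SyncQueue *q)
{
	pthread_mutex_init(&q->lock, NULL);
	pthread_cond_init(&q->not_full, NULL);
	q->count = 0;
}

/* Producer side: block until there is room, with no sleep/poll loop. */
static void
register_request(SyncQueue *q)
{
	pthread_mutex_lock(&q->lock);
	while (q->count == QUEUE_CAPACITY)
		pthread_cond_wait(&q->not_full, &q->lock);
	q->count++;
	pthread_mutex_unlock(&q->lock);
}

/* Consumer side: drain the queue, waking waiters only if it had been full. */
static int
absorb_requests(SyncQueue *q)
{
	int			n;
	bool		was_full;

	pthread_mutex_lock(&q->lock);
	n = q->count;
	was_full = (n == QUEUE_CAPACITY);
	q->count = 0;
	if (was_full)
		pthread_cond_broadcast(&q->not_full);
	pthread_mutex_unlock(&q->lock);
	return n;
}

typedef struct ProducerArgs
{
	SyncQueue  *q;
	int			n;
} ProducerArgs;

static void *
producer_main(void *arg)
{
	ProducerArgs *a = arg;

	for (int i = 0; i < a->n; i++)
		register_request(a->q);
	return NULL;
}

/* Run one producer against one consumer; returns the number absorbed. */
static long
run_demo(int n)
{
	SyncQueue	q;
	ProducerArgs args;
	pthread_t	producer;
	long		total = 0;
	int			rc;

	queue_init(&q);
	args.q = &q;
	args.n = n;
	rc = pthread_create(&producer, NULL, producer_main, &args);
	assert(rc == 0);

	while (total < n)
	{
		total += absorb_requests(&q);
		sched_yield();			/* let the producer refill the queue */
	}

	pthread_join(producer, NULL);
	pthread_cond_destroy(&q.not_full);
	pthread_mutex_destroy(&q.lock);
	return total;
}
```

Calling run_demo(1000) from a test program should return 1000. The producer can only be asleep while the queue is full, so broadcasting on that transition alone is enough to avoid a lost wakeup in this toy version.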

> I do think it's worth giving that sleep a proper wait event though, even in the back branches.

I'm thinking that 0002 should be back-patched all the way, but 0001
could be limited to 14.

Attachment	Content-Type	Size
v2-0001-Wake-up-for-latches-in-CheckpointWriteDelay.patch	text/x-patch	3.8 KB
v2-0002-Fix-waiting-in-RegisterSyncRequest.patch	text/x-patch	3.3 KB
v2-0003-Use-condition-variable-to-wait-when-sync-request-.patch	text/x-patch	10.1 KB

In response to

Re: Checkpointer sync queue fills up / loops around pg_usleep() are bad at 2022-02-28 01:36:20 from Andres Freund

Responses

Re: Checkpointer sync queue fills up / loops around pg_usleep() are bad at 2022-03-01 21:58:48 from Andres Freund
