Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, tgl(at)sss(dot)pgh(dot)pa(dot)us, vignesh21(at)gmail(dot)com, lukas(at)fittl(dot)com, alvherre(at)alvh(dot)no-ip(dot)org, magnus(at)hagander(dot)net, pgsql-hackers(at)postgresql(dot)org, thomas(dot)munro(at)gmail(dot)com, m(dot)sakrejda(at)gmail(dot)com
Subject: Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)
Date: 2023-03-10 19:51:13
Message-ID: CAAKRu_boi6g8Jj82RWnrKR=VbbEK+Be=ASjPv8PePWbDigjCPA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Mar 9, 2023 at 2:43 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2023-03-09 06:51:31 -0600, Justin Pryzby wrote:
> > On Tue, Mar 07, 2023 at 10:18:44AM -0800, Andres Freund wrote:
> > There's a 2nd portion of the test that's still flapping, at least on
> > cirrusci.
> >
> > The issue that Tom mentioned is at:
> > SELECT :io_sum_shared_after_writes > :io_sum_shared_before_writes;
> >
> > But what I've seen on cirrusci is at:
> > SELECT :io_sum_shared_after_writes > :io_sum_shared_before_writes;
>
> Seems you meant to copy a different line for Tom's (s/writes/redas/)?
>
>
> > https://api.cirrus-ci.com/v1/artifact/task/6701069548388352/log/src/test/recovery/tmp_check/regression.diffs
>
> Hm. I guess the explanation here is that the buffers were already all written
> out by another backend. Which is made more likely by your patch.
>
>
> I found a few more occurances and chatted with Melanie. Melanie will come up
> with a fix I think.

So, what this test is relying on is that either the checkpointer or
another backend will flush the pages of test_io_shared which we dirtied
above in the test. The test specifically checks for IOCONTEXT_NORMAL
writes. It could fail if some other backend is doing a bulkread or
bulkwrite and flushes these buffers first in a strategy context.
This will happen more often when shared buffers is small.

I tried to come up with a reliable test which was limited to
IOCONTEXT_NORMAL. I thought if we could guarantee a dirty buffer would
be pinned using a cursor, that we could then issue a checkpoint and
guarantee a flush that way. However, I don't see a way to guarantee that
no one flushes the buffer between dirtying it and pinning it with the
cursor.

So, I think our best bet is to just change the test to pass if there are
any writes in any contexts. By moving the sum(writes) before the INSERT
and keeping the checkpoint, we can guarantee that someway or another,
some buffers will be flushed. This essentially covers the same code anyway.

Patch attached.

- Melanie

Attachment Content-Type Size
Stabilize-pg_stat_io-writes-test.patch text/x-patch 3.7 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2023-03-10 20:07:05 Re: Ability to reference other extensions by schema in extension scripts
Previous Message Jeff Davis 2023-03-10 18:58:49 Re: pgsql: Use ICU by default at initdb time.