Re: What does "[backends] should seldom or never need to wait for a write to occur" mean?

From: Chris Wilson <chris+google(at)qwirx(dot)com>
To: Chris Wilson <chris+google(at)qwirx(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Pg Docs <pgsql-docs(at)lists(dot)postgresql(dot)org>
Subject: Re: What does "[backends] should seldom or never need to wait for a write to occur" mean?
Date: 2020-11-03 18:11:21
Message-ID: CAOg7f80WL7cR1TgXXCzzXYcNtfjgAkkc+zXr9T9tshy8jBSZLw@mail.gmail.com
Lists: pgsql-docs

Hi all,

I did some more research and found this explanation in a presentation by
2ndQuadrant
<https://www.2ndquadrant.com/wp-content/uploads/2019/05/Inside-the-PostgreSQL-Shared-Buffer-Cache.pdf>:

> When a process wants a buffer, it asks BufferAlloc for the file/block. If
> the block is already cached, it gets pinned and then returned. Otherwise, a
> new buffer must be found to hold this data. If there are no buffers free
> (there usually aren’t) BufferAlloc selects a buffer to evict to make space
> for the new one. If that page is dirty, it is written out to disk. This can
> cause the backend trying to allocate that buffer to block as it waits for
> that write I/O to complete.

So it seems that a backend can end up waiting for write I/O whether it is
reading or writing, because either kind of access may need to evict a dirty
buffer. The bgwriter reduces the risk of hitting a dirty page and needing
to write it before evicting.
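
The eviction behaviour described in the quoted passage can be sketched with
a toy model. This is only an illustration, not PostgreSQL's code: the names
ToyBufferPool, access, bgwriter_pass and run are made up for this example,
and eviction here is plain least-recently-used, whereas PostgreSQL uses a
clock-sweep algorithm.

```python
from collections import OrderedDict


class ToyBufferPool:
    """Toy shared buffer pool (hypothetical, LRU eviction for simplicity)."""

    def __init__(self, size):
        self.size = size
        self.buffers = OrderedDict()  # block number -> dirty flag, oldest first
        self.backend_writes = 0       # dirty evictions a backend had to write itself
        self.bgwriter_writes = 0      # pages cleaned ahead of time

    def access(self, block, modify=False):
        """A backend asks for a block, optionally dirtying it."""
        if block in self.buffers:
            # Cache hit: mark as recently used, keep any existing dirty flag.
            self.buffers.move_to_end(block)
            self.buffers[block] = self.buffers[block] or modify
            return
        if len(self.buffers) >= self.size:
            # No free buffer: evict the least recently used one.
            _victim, dirty = self.buffers.popitem(last=False)
            if dirty:
                # The backend itself must write the page out before reuse.
                self.backend_writes += 1
        self.buffers[block] = modify

    def bgwriter_pass(self, max_pages):
        """Clean up to max_pages dirty buffers, oldest first."""
        cleaned = 0
        for block, dirty in list(self.buffers.items()):
            if cleaned >= max_pages:
                break
            if dirty:
                self.buffers[block] = False
                self.bgwriter_writes += 1
                cleaned += 1


def run(with_bgwriter):
    """Dirty 20 distinct blocks through a 4-buffer pool."""
    pool = ToyBufferPool(size=4)
    for block in range(20):
        pool.access(block, modify=True)
        if with_bgwriter:
            pool.bgwriter_pass(max_pages=2)
    return pool.backend_writes, pool.bgwriter_writes


print(run(with_bgwriter=False))  # (16, 0)
print(run(with_bgwriter=True))   # (0, 20)
```

In this toy run the background pass removes every backend-side write, but
the total number of writes goes up from 16 to 20, which matches the concern
raised earlier in the thread that cleaning buffers early means writing more
overall.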

So perhaps the documentation should say:

"There is a separate server process called the background writer, whose
function is to issue writes of “dirty” (new or modified) shared buffers.
This reduces the chances that a backend needing an empty buffer must write
a dirty one back to disk before evicting it."

Thanks, Chris.

On Mon, 2 Nov 2020 at 12:38, Chris Wilson <chris+google(at)qwirx(dot)com> wrote:

> Hi all,
>
> Thanks Thomas.
>
> When the bgwriter flushes (cleans) a dirty Postgres buffer, it generates a
> write() syscall of its own, which I think must increase the number of dirty
> cache buffers in the Linux kernel (temporarily, until it actually flushes
> those cache buffers to disk). Therefore it temporarily increases the risk
> of a write stall (in any process, not just Postgres backends). Is that
> correct?
>
> I suppose that if dirty buffers are being cleaned regularly, then it
> reduces the risk that (1) a Postgres backend which is writing (dirtying
> buffers) suddenly needs an empty buffer when there are no clean buffers to
> evict, so it needs to flush a dirty one and (2) the resulting write()
> syscall would take the kernel over its background dirty limit, so the
> kernel must flush it immediately, and make the backend wait. By that
> mechanism I can see that it might reduce the chance of backends having to
> wait, but by writing more in general (as above) it could also increase it.
>
> So when it says "It writes shared buffers so server processes handling
> user queries seldom or never need to wait for a write to occur", is that
> really justified, or is that sentence incorrect and we should remove it? Or
> have I missed something?
>
> Thanks, Chris.
>
> On Sun, 1 Nov 2020 at 21:00, Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
>
>> On Fri, Oct 30, 2020 at 11:24 AM PG Doc comments form
>> <noreply(at)postgresql(dot)org> wrote:
>> > The following documentation comment has been logged on the website:
>> >
>> > Page: https://www.postgresql.org/docs/13/runtime-config-resource.html
>> > Description:
>> >
>> > https://www.postgresql.org/docs/13/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-BACKGROUND-WRITER
>> >
>> > says:
>> >
>> > "There is a separate server process called the background writer, whose
>> > function is to issue writes of “dirty” (new or modified) shared buffers. It
>> > writes shared buffers so server processes handling user queries seldom or
>> > never need to wait for a write to occur."
>> >
>> > It's not clear what "wait for a write to occur" means: a write() syscall or
>> > an fsync() syscall?
>>
>> It means pwrite(). That could block if your kernel cache is swamped,
>> but hopefully it just copies the data into the kernel and returns.
>> There is an fsync() call, but it's usually queued up for handling by
>> the checkpointer process some time later.
>>
>
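
The distinction drawn in the quoted reply between pwrite() and fsync() can
be illustrated with a minimal sketch. It assumes a Unix-like system (Python
only provides os.pwrite there); the function name write_then_sync and the
use of a temporary file are made up for this example and have nothing to do
with how PostgreSQL lays out its data files.

```python
import os
import tempfile


def write_then_sync(data: bytes) -> int:
    """Write data at offset 0 of a temporary file, then force it to storage."""
    fd, path = tempfile.mkstemp()
    try:
        # pwrite() normally just copies the data into the kernel's page
        # cache and returns; it may block if the kernel has too much
        # dirty data outstanding.
        written = os.pwrite(fd, data, 0)
        # fsync() does not return until the kernel reports the file's
        # data as flushed to the storage device.
        os.fsync(fd)
        return written
    finally:
        os.close(fd)
        os.unlink(path)


print(write_then_sync(b"x" * 8192))
```

Here both calls are made by the same process for brevity. In the situation
described in the reply, the backend or bgwriter would issue only the write,
and the fsync() would be handled later by the checkpointer.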
