Re: bgwriter and checkpoints

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Henry Francisco Garcia Cortez <garcortez(at)gmail(dot)com>
Cc: "pgsql-admin(at)postgresql(dot)org" <pgsql-admin(at)postgresql(dot)org>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, "alvherre(at)2ndquadrant(dot)com" <alvherre(at)2ndquadrant(dot)com>
Subject: Re: bgwriter and checkpoints
Date: 2021-09-03 22:16:24
Message-ID: CAKFQuwZV2ZGexz08pVu9CgO+OeC4V4rtvv0LqNaRimmk04yfmA@mail.gmail.com
Lists: pgsql-admin

On Friday, September 3, 2021, Henry Francisco Garcia Cortez <
garcortez(at)gmail(dot)com> wrote:

> Hi community, I was reading about background writer process and
> checkpoints, but I still don't understand very well. How do they work
>

At a basic level the database consists of files representing its
current data (tables) and a record of all changes (the WAL).

For performance, the current data is kept in volatile memory. Since losing
data is not an option, something (the WAL) has to be written to disk.

Now, it is not feasible for the current data to only exist in RAM, so there
are processes that save the current data to disk as well. The bgwriter and
checkpointer processes do this. Each has different goals and
configurations but they work together and the nuances are outside my scope
here.

Since we are recording the current data to disk, at some point the changes
that resulted in that particular current data state become obsolete. A
checkpoint record is added to the WAL at locations where it is known that
all previous change data is obsolete due to the writing of current data to
disk. This happens periodically while the server is running, controlled by
various configuration variables (checkpoint_timeout and max_wal_size, for
example). A user may also issue a CHECKPOINT SQL command to have the server
record all current data to disk immediately so that a new checkpoint record
can be written to the WAL.
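
To make that concrete, here is a rough toy model in Python. It is only a
sketch of the idea, not how the server is implemented: the ToyServer class,
its fields, and the page names are all made up for illustration, and the
WAL list is simply assumed to be durable.

```python
# Toy model only: every name here is hypothetical, not a PostgreSQL internal.
class ToyServer:
    def __init__(self):
        self.disk = {}      # table pages on persistent storage
        self.buffers = {}   # pages held in volatile memory
        self.dirty = set()  # pages changed in memory but not yet on disk
        self.wal = []       # append-only change log, assumed durable

    def update(self, page, value):
        # The change is logged first, then applied in memory only.
        self.wal.append(("change", page, value))
        self.buffers[page] = value
        self.dirty.add(page)

    def checkpoint(self):
        # Write every dirty page to disk, then mark that spot in the WAL.
        for page in sorted(self.dirty):
            self.disk[page] = self.buffers[page]
        self.dirty.clear()
        self.wal.append(("checkpoint",))

    def wal_needed_for_recovery(self):
        # Only records after the most recent checkpoint are still needed.
        last = max(
            (i for i, rec in enumerate(self.wal) if rec[0] == "checkpoint"),
            default=-1,
        )
        return self.wal[last + 1:]


server = ToyServer()
server.update("p1", "a")
server.update("p2", "b")
server.checkpoint()
server.update("p1", "c")

print(server.disk)
print(server.wal_needed_for_recovery())
```

After the checkpoint the two earlier change records are obsolete, because
their effect is already in the disk files; only the later change to p1 would
have to be replayed after a crash.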

Shared buffers hold the fragments (pages) of the table files that are in
volatile memory, while the heap refers to those table files on persistent disk.

During crash recovery the current data on disk is taken as the starting
point, and then all changes since the last checkpoint are applied to bring
the system up to the current data state that was in volatile memory at the
time of the crash. Some of the post-final-checkpoint changes may have
already made it to the current data disk files. That's OK: reapplying those
changes, so long as it is done in sequential order, will result in a correct
final state.

Side note: The last paragraph also means that during crash recovery one
cannot stop the change application process early. Some of those future
changes may already be in the disk table files and cannot be undone, which
is what would have to happen in order to stop at a time before they occurred.
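
The recovery behavior can be sketched the same way. Again this is a toy
under stated assumptions, not the server's actual redo code: the recover
function, its stop_after parameter, and the record layout are hypothetical,
and each change record simply sets a page to a value.

```python
# Toy model only: names and record layout are hypothetical.
def recover(disk, wal, stop_after=None):
    """Replay, in order, the changes logged after the last checkpoint."""
    last = max(
        (i for i, rec in enumerate(wal) if rec[0] == "checkpoint"),
        default=-1,
    )
    pending = wal[last + 1:]
    if stop_after is not None:
        pending = pending[:stop_after]  # simulate stopping replay early
    state = dict(disk)
    for _, page, value in pending:
        state[page] = value
    return state


wal = [
    ("change", "p1", "a"),
    ("checkpoint",),
    ("change", "p1", "b"),
    ("change", "p2", "x"),
    ("change", "p1", "c"),
]
memory_at_crash = {"p1": "c", "p2": "x"}

# Disk exactly as the checkpoint left it.
disk_clean = {"p1": "a"}
# Same, except the later change to p2 had already been written out.
disk_partial = {"p1": "a", "p2": "x"}

print(recover(disk_clean, wal))
print(recover(disk_partial, wal))
print(recover(disk_partial, wal, stop_after=1))
```

Both full replays end at the in-memory state from the time of the crash,
even though one change was already on disk. The early stop leaves p2 = "x"
on disk next to p1 = "b", a combination that never existed, since p2 was
changed only after that point in the log.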

David J.
