Multiple full page writes in a single checkpoint?

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Multiple full page writes in a single checkpoint?
Date: 2021-02-03 23:05:56
Message-ID: 20210203230556.GB11069@momjian.us
Lists: pgsql-hackers

Cluster file encryption plans to use the LSN and page number as the
nonce for heap/index pages. I am looking into the use of a unique nonce
during hint bit changes. (You need to use a new nonce for re-encrypting
a page that changes.)
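
To make the nonce requirement concrete, here is a minimal sketch of an
LSN-plus-page-number nonce. The 12-byte layout (8-byte LSN followed by a
4-byte page number) and the name make_nonce are illustrative assumptions,
not taken from the cluster file encryption patch.

```python
import struct


def make_nonce(lsn: int, page_number: int) -> bytes:
    """Pack a 64-bit LSN and a 32-bit page number into 12 bytes.

    Hypothetical layout chosen for illustration only.
    """
    return struct.pack(">QI", lsn, page_number)


# Same page, same LSN: the nonce repeats, which is the case to avoid
# when the page contents changed between the two writes.
repeated = make_nonce(0x16B3748, 7) == make_nonce(0x16B3748, 7)

# Same page, new LSN: the nonce differs.
fresh = make_nonce(0x16B3748, 7) != make_nonce(0x16B3790, 7)
```

The point of the sketch is only that the nonce is unique exactly when the
(LSN, page number) pair is unique, so any page write that does not advance
the page LSN would reuse a nonce.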

wal_log_hints already gives us a unique nonce for the first hint bit
change on a page during a checkpoint, but we only encrypt on page write
to the file system, so I am researching whether wal_log_hints will
already generate a unique LSN for every page write to the file system,
even if there are multiple hint-bit-caused page writes to the file
system during a single checkpoint. (We already know this works for
multiple checkpoints.)

Our docs on full_page_writes state:

When this parameter is on, the
<productname>PostgreSQL</productname> server writes the entire
content of each disk page to WAL during the first modification
of that page after a checkpoint.

and the docs on wal_log_hints state:

When this parameter is <literal>on</literal>, the
<productname>PostgreSQL</productname> server writes the entire
content of each disk page to WAL during the first modification of
that page after a checkpoint, even for non-critical modifications
of so-called hint bits.

However, imagine these steps:

1. checkpoint starts
2. page is modified by row or hint bit change
3. page gets a new LSN and is marked as dirty
4. page image is flushed to WAL
5. page is written to disk and marked as clean
6. page is modified by data or hint bit change
7. page gets a new LSN and is marked as dirty
8. page image is flushed to WAL
9. checkpoint completes
10. page is written to disk and marked as clean

Is the above case valid, and would it cause two full page writes to WAL?
More specifically, wouldn't it cause every write of the page to the file
system to use a new LSN?

If so, this means wal_log_hints is sufficient to guarantee a new nonce
for every page image, even for multiple hint bit changes and page writes
during a single checkpoint, and there is then no need for a hint bit
counter on the page --- the unique LSN does that for us. I know
wal_log_hints was designed to fix torn pages, but it seems to also do
exactly what cluster file encryption needs.

If the above is all true, should we update the docs, READMEs, or C
comments about this? I think the cluster file encryption patch would at
least need to document that we need to keep this behavior, because I
don't think wal_log_hints needs to behave this way for checksum
purposes because of the way full page writes are processed during crash
recovery.
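
The numbered scenario can be walked through with a toy model. This is not
PostgreSQL code. It assumes the documented rule is applied literally: a
full page image is written only for the first modification after the
checkpoint's start point, later row changes get an ordinary WAL record,
and later hint-bit-only changes write no WAL and leave the page LSN alone.
All names (ToyWal, ToyPage, modify, run_scenario) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ToyWal:
    next_lsn: int = 1
    redo_ptr: int = 0            # WAL position where the checkpoint started
    full_page_images: int = 0

    def insert_record(self) -> int:
        lsn = self.next_lsn
        self.next_lsn += 1
        return lsn

    def start_checkpoint(self) -> None:
        self.redo_ptr = self.insert_record()


@dataclass
class ToyPage:
    lsn: int = 0
    dirty: bool = False


def modify(wal: ToyWal, page: ToyPage, hint_only: bool) -> None:
    if page.lsn <= wal.redo_ptr:
        # First modification since the checkpoint started: full page image.
        wal.full_page_images += 1
        page.lsn = wal.insert_record()
    elif not hint_only:
        # Later row change: ordinary WAL record, new LSN.
        page.lsn = wal.insert_record()
    # Assumption: a later hint-bit-only change writes no WAL, LSN unchanged.
    page.dirty = True


def write_to_disk(page: ToyPage) -> int:
    page.dirty = False
    return page.lsn              # LSN that would feed the nonce


def run_scenario(second_is_hint_only: bool) -> tuple[int, int, int]:
    wal, page = ToyWal(), ToyPage()
    wal.start_checkpoint()                         # step 1
    modify(wal, page, hint_only=True)              # steps 2-4
    first = write_to_disk(page)                    # step 5
    modify(wal, page, second_is_hint_only)         # steps 6-8
    second = write_to_disk(page)                   # step 10
    return first, second, wal.full_page_images
```

Under these assumptions, run_scenario(True) returns the same LSN for both
writes, while run_scenario(False) returns two different LSNs, and in both
cases only one full page image is counted. Whether the server really
behaves like this model for the second hint bit change is the open
question above.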

--
Bruce Momjian <bruce(at)momjian(dot)us> https://momjian.us
EDB https://enterprisedb.com

The usefulness of a cup is in its emptiness, Bruce Lee
