| From: | Andres Freund <andres(at)anarazel(dot)de> |
|---|---|
| To: | Melanie Plageman <melanieplageman(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Noah Misch <noah(at)leadboat(dot)com> |
| Cc: | Kirill Reshke <reshkekirill(at)gmail(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com> |
| Subject: | Re: Buffer locking is special (hints, checksums, AIO writes) |
| Date: | 2026-02-02 22:33:10 |
| Message-ID: | 5ubipyssiju5twkb7zgqwdr7q2vhpkpmuelxfpanetlk6ofnop@hvxb4g2amb2d |
| Lists: | pgsql-hackers |
Hi,
On 2026-01-14 16:20:58 -0500, Andres Freund wrote:
> I'm now working on cleaning up the last two commits. The most crucial bit is
> to simplify what happens in MarkSharedBufferDirtyHint(), we afaict can delete
> the use of DELAY_CHKPT_START etc and just go to marking the buffer dirty first
> and then do the WAL logging, just like normal WAL logging. The previous order
> was only required because we were dirtying the page while holding only a
> shared lock, which did not conflict with the lock held by SyncBuffers() etc.
I've been working on that.
- A lot of what was special about MarkBufferDirtyHint() isn't needed anymore:
  - The "abnormal" order of WAL logging before marking the buffer dirty was
    only needed because we marked buffers dirty, which in turn was only needed
    because setting hint bits didn't conflict with flushing the page. With
    share-exclusive they do conflict, and we can switch to the normal order of
    operations, where marking a buffer dirty makes checkpoint wait when the
    buffer is encountered (due to wanting to flush the buffer but not getting
    the lock).
  - Now that we use the normal order of WAL logging, we don't need to delay
    checkpoint starts anymore.
    I think the explanation for why that is ok is correct [1], but it needs to
    be looked at by somebody with experience around this. Maybe Heikki?
  - Thanks to holding share-exclusive lock, nothing can concurrently dirty or
    undirty the buffer. Therefore the comments about spurious failures to mark
    the buffer dirty can be removed.
- I realized that, now that buffers cannot be dirtied while IO is ongoing, we
  don't need BM_JUST_DIRTIED anymore.
- The way MarkBufferDirtyHint() operates was copied into
  heap_inplace_update_and_unlock(). Now that MarkBufferDirtyHint() won't work
  that way anymore, it seems better to go with the alternative approach the
  comments already outlined, namely to only delay updating of the buffer
  contents.
  I've done this in a prerequisite commit, as it doesn't actually depend on any
  of the other changes. Noah, any chance you could take a look at this?
- Lots of minor polish
Greetings,
Andres Freund
[1]
/*
* Update RedoRecPtr so that we can make the right decision. It's possible
* that a new checkpoint will start just after GetRedoRecPtr(), but that
* is ok, as the buffer is already dirty, ensuring that any BufferSync()
* started after the buffer was marked dirty cannot complete without
* flushing this buffer. If a checkpoint started between marking the
* buffer dirty and this check, we will emit an unnecessary WAL record (as
* the buffer will be written out as part of the checkpoint), but the
* window for that is small.
*/
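
To make the ordering argument in [1] concrete, here is a minimal toy model in C. It is not PostgreSQL code and is not taken from the patches: every name in it (ToyBuffer, ToyWal, toy_start_checkpoint, toy_mark_dirty_hint) is made up for illustration, LSNs are plain counters, and locking is not modelled at all (the caller is simply assumed to hold the share-exclusive lock). It only sketches the "dirty first, then read the redo pointer, then decide whether to log" sequence described in the comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these types exist in PostgreSQL. */
typedef uint64_t ToyLsn;

typedef struct ToyBuffer
{
    ToyLsn page_lsn;   /* LSN of the last WAL record covering this page */
    bool   dirty;      /* must be flushed before a checkpoint completes */
} ToyBuffer;

typedef struct ToyWal
{
    ToyLsn redo_ptr;   /* redo start of the most recently started checkpoint */
    ToyLsn insert_ptr; /* LSN of the most recently inserted record */
    int    fpi_count;  /* number of full-page images emitted so far */
} ToyWal;

/* A checkpoint start moves the redo pointer up to the current insert position. */
static void
toy_start_checkpoint(ToyWal *wal)
{
    assert(wal != NULL);
    wal->redo_ptr = wal->insert_ptr;
}

/*
 * Toy version of the "normal" order: mark the buffer dirty first, then look
 * at the redo pointer and decide whether a full-page image is needed.
 * Returns true if a (toy) WAL record was emitted.
 */
static bool
toy_mark_dirty_hint(ToyBuffer *buf, ToyWal *wal)
{
    assert(buf != NULL && wal != NULL);

    /* Step 1: dirty the buffer; from here on a checkpoint has to flush it. */
    buf->dirty = true;

    /* Step 2: only now read the redo pointer. */
    ToyLsn redo = wal->redo_ptr;

    /* Step 3: the page needs an image if it was last logged at or before redo. */
    if (buf->page_lsn <= redo)
    {
        wal->insert_ptr += 1;
        buf->page_lsn = wal->insert_ptr;
        wal->fpi_count += 1;
        return true;
    }
    return false;
}
```

In this model, a checkpoint that starts between step 1 and step 2 only causes an extra image to be emitted; it cannot cause one to be skipped, since the buffer is already dirty by the time the redo pointer is read. That is the same trade-off the comment describes, under the simplifying assumptions stated above.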
| Attachment | Content-Type | Size |
|---|---|---|
| v12-0001-heapam-Don-t-mimic-MarkBufferDirtyHint-in-inplac.patch | text/x-diff | 4.0 KB |
| v12-0002-Require-share-exclusive-lock-to-set-hint-bits-an.patch | text/x-diff | 46.4 KB |
| v12-0003-bufmgr-Remove-the-now-obsolete-BM_JUST_DIRTIED.patch | text/x-diff | 6.4 KB |
| v12-0004-bufmgr-Switch-to-standard-order-in-MarkBufferDir.patch | text/x-diff | 6.9 KB |
| v12-0005-bufmgr-Don-t-copy-pages-while-writing-out.patch | text/x-diff | 10.9 KB |
| v12-0006-WIP-Make-UnlockReleaseBuffer-more-efficient.patch | text/x-diff | 3.5 KB |