From: | Xuneng Zhou <xunengzhou(at)gmail(dot)com> |
---|---|
To: | Michael Paquier <michael(at)paquier(dot)xyz> |
Cc: | pgsql-hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
Subject: | Re: Improve read_local_xlog_page_guts by replacing polling with latch-based waiting |
Date: | 2025-10-11 03:02:51 |
Message-ID: | CABPTF7WuFr6Z7zPMoqgk4BCLs8uA1ihCSDKEq1wbxJJB4Qy+Sg@mail.gmail.com |
Lists: | pgsql-hackers |

Hi,

The following is the split patch set. There are certain limitations to
this simplification effort, particularly in patch 2. The
read_local_xlog_page_guts callback demands more functionality from the
facility than the WAIT FOR patch — specifically, it must wait for WAL
flush events, though it does not require timeout handling. In some
sense, parts of patch 3 can be viewed as a superset of the WAIT FOR
patch, since they install wake-up hooks in more locations. Unlike the
WAIT FOR patch, which only needs wake-ups triggered by replay,
read_local_xlog_page_guts must also handle wake-ups triggered by WAL
flushes.

Workload characteristics play a key role here. A sorted dlist performs
well when insertions and removals occur in order, achieving O(1)
complexity in the best case. In synchronous replication, insertion
patterns seem generally monotonic with commit LSNs, though not
strictly ordered due to timing variations and contention. When most
insertions remain ordered, a dlist can be efficient. However, as the
number of elements grows and out-of-order insertions become more
frequent, the insertion cost can degrade to O(n) more often.
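
To make the cost argument concrete, here is a minimal standalone sketch of a
sorted doubly linked wait queue. It is not code from the patch set or from
syncrep.c; the names Lsn, Waiter, WaitQueue and queue_insert_sorted are
hypothetical, and the backward scan from the tail is an assumption chosen
because it makes in-order arrivals cheap. The function returns how many
existing entries it had to step over.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t Lsn;           /* hypothetical stand-in for XLogRecPtr */

typedef struct Waiter
{
    Lsn            lsn;         /* LSN this waiter is waiting for */
    struct Waiter *prev;
    struct Waiter *next;
} Waiter;

typedef struct WaitQueue
{
    Waiter *head;               /* smallest LSN */
    Waiter *tail;               /* largest LSN */
} WaitQueue;

/*
 * Insert w so the queue stays sorted by ascending LSN, scanning backward
 * from the tail.  Returns the number of existing nodes stepped over:
 * 0 for an in-order arrival, up to n for the worst out-of-order arrival.
 */
static int
queue_insert_sorted(WaitQueue *q, Waiter *w)
{
    Waiter *pos = q->tail;
    int     steps = 0;

    while (pos != NULL && pos->lsn > w->lsn)
    {
        pos = pos->prev;
        steps++;
    }

    /* Link w immediately after pos, or at the head if pos is NULL. */
    w->prev = pos;
    if (pos != NULL)
    {
        w->next = pos->next;
        pos->next = w;
    }
    else
    {
        w->next = q->head;
        q->head = w;
    }

    if (w->next != NULL)
        w->next->prev = w;
    else
        q->tail = w;

    return steps;
}
```

With LSNs arriving as 100, 200, 300 each insert steps over nothing; a late
arrival of 50 then has to walk past all three entries, which is the O(n)
case described above.
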
By contrast, a pairing heap maintains stable O(1) insertion for both
ordered and disordered inputs, with amortized O(log n) removals. Since
LSNs in the WAIT FOR command are likely to arrive in a non-sequential
fashion, the pairing heap introduced in v6 provides more predictable
performance under such workloads.
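
For comparison, a rough standalone sketch of a min-ordered pairing heap
keyed by LSN follows. This is only an illustration of the technique, not the
lib/pairingheap.c implementation and not what the v7 patches contain;
HeapNode, heap_merge, heap_add and heap_remove_first are hypothetical names.
Insertion is a single link operation regardless of arrival order, while
removing the minimum does the usual two-pass pairing of the root's children.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t Lsn;           /* hypothetical stand-in for XLogRecPtr */

typedef struct HeapNode
{
    Lsn              lsn;
    struct HeapNode *first_child;
    struct HeapNode *next_sibling;
} HeapNode;

/* Link two heaps: the root with the larger LSN becomes a child.  O(1). */
static HeapNode *
heap_merge(HeapNode *a, HeapNode *b)
{
    if (a == NULL)
        return b;
    if (b == NULL)
        return a;
    if (b->lsn < a->lsn)
    {
        HeapNode *tmp = a;

        a = b;
        b = tmp;
    }
    b->next_sibling = a->first_child;
    a->first_child = b;
    return a;
}

/* Add a node; O(1) whether or not LSNs arrive in order. */
static HeapNode *
heap_add(HeapNode *root, HeapNode *node)
{
    node->first_child = NULL;
    node->next_sibling = NULL;
    return heap_merge(root, node);
}

/* Detach and return the smallest-LSN node, or NULL if the heap is empty. */
static HeapNode *
heap_remove_first(HeapNode **root)
{
    HeapNode *min = *root;
    HeapNode *list;
    HeapNode *pairs = NULL;
    HeapNode *result = NULL;

    if (min == NULL)
        return NULL;
    list = min->first_child;

    /* Pass 1: merge children two at a time, stacking the results. */
    while (list != NULL)
    {
        HeapNode *a = list;
        HeapNode *b = a->next_sibling;
        HeapNode *merged;

        list = (b != NULL) ? b->next_sibling : NULL;
        a->next_sibling = NULL;
        if (b != NULL)
            b->next_sibling = NULL;
        merged = heap_merge(a, b);
        merged->next_sibling = pairs;
        pairs = merged;
    }

    /* Pass 2: fold the stacked pairs into a single heap. */
    while (pairs != NULL)
    {
        HeapNode *next = pairs->next_sibling;

        pairs->next_sibling = NULL;
        result = heap_merge(result, pairs);
        pairs = next;
    }

    min->first_child = NULL;
    *root = result;
    return min;
}
```

A wake-up routine built on this would repeatedly look at the root and remove
it while its LSN is at or below the current flushed or replayed position, so
waiters come out in LSN order even if they were added out of order.
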
At this stage (v7), no consolidation between syncrep and xlogwait has
been implemented. This is mainly because the dlist and the pairing heap
each work well under different workloads; neither is likely to be
universally optimal. Introducing the facility with a pairing heap
first seems reasonable, as it offers flexibility for future
refactoring: we could later replace dlist with a heap or adopt a
modular design depending on observed workload characteristics.

Best,
Xuneng

Attachment | Content-Type | Size |
---|---|---|
v7-0002-Add-infrastructure-for-efficient-LSN-waiting.patch | application/octet-stream | 25.5 KB |
v7-0001-Add-pairingheap_initialize-for-shared-memory-usag.patch | application/octet-stream | 3.0 KB |
v7-0003-Improve-read_local_xlog_page_guts-by-replacing-po.patch | application/octet-stream | 9.0 KB |