From: | Xuneng Zhou <xunengzhou(at)gmail(dot)com> |
---|---|
To: | Michael Paquier <michael(at)paquier(dot)xyz> |
Cc: | pgsql-hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
Subject: | Re: Improve read_local_xlog_page_guts by replacing polling with latch-based waiting |
Date: | 2025-10-15 08:43:29 |
Message-ID: | CABPTF7XNtg5vh8hJkcv5tnBRtbVzZLP9MQVyUbpK=zAxhczUjw@mail.gmail.com |
Lists: | pgsql-hackers |
Hi,
On Wed, Oct 15, 2025 at 8:31 AM Xuneng Zhou <xunengzhou(at)gmail(dot)com> wrote:
>
> Hi,
>
> On Sat, Oct 11, 2025 at 11:02 AM Xuneng Zhou <xunengzhou(at)gmail(dot)com> wrote:
> >
> > Hi,
> >
> > The following is the split patch set. There are certain limitations to
> > this simplification effort, particularly in patch 2. The
> > read_local_xlog_page_guts callback demands more functionality from the
> > facility than the WAIT FOR patch — specifically, it must wait for WAL
> > flush events, though it does not require timeout handling. In some
> > sense, parts of patch 3 can be viewed as a superset of the WAIT FOR
> > patch, since it installs wake-up hooks in more locations. Unlike the
> > WAIT FOR patch, which only needs wake-ups triggered by replay,
> > read_local_xlog_page_guts must also handle wake-ups triggered by WAL
> > flushes.
> >
> > Workload characteristics play a key role here. A sorted dlist performs
> > well when insertions and removals occur in order, achieving O(1)
> > complexity in the best case. In synchronous replication, insertion
> > patterns seem generally monotonic with commit LSNs, though not
> > strictly ordered due to timing variations and contention. When most
> > insertions remain ordered, a dlist can be efficient. However, as the
> > number of elements grows and out-of-order insertions become more
> > frequent, the insertion cost can degrade to O(n) more often.
> >
> > By contrast, a pairing heap maintains stable O(1) insertion for both
> > ordered and disordered inputs, with amortized O(log n) removals. Since
> > LSNs in the WAIT FOR command are likely to arrive in a non-sequential
> > fashion, the pairing heap introduced in v6 provides more predictable
> > performance under such workloads.
> >
> > At this stage (v7), no consolidation between syncrep and xlogwait has
> > been implemented. This is mainly because the dlist and pairing heap
> > each work well under different workloads; neither is likely to be
> > universally optimal. Introducing the facility with a pairing heap
> > first seems reasonable, as it offers flexibility for future
> > refactoring: we could later replace dlist with a heap or adopt a
> > modular design depending on observed workload characteristics.
> >
>
> v8-0002 removed the early fast check before addLSNWaiter in WaitForLSNReplay,
> as the likelihood of a server state change is small compared to the
> branching cost and added code complexity.
>
v9 makes a minor change to the #include of xlogwait.h in patch 2 to keep the CF bot happy.
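
To make the dlist vs. pairing heap trade-off quoted above concrete, here is a minimal standalone sketch. It does not use PostgreSQL's ilist.h or pairingheap.h; every name in it (Lsn, SortedList, sorted_list_insert, heap_insert, heap_remove_first, and the two helpers) is hypothetical and exists only for illustration. It assumes the sorted list is searched from the tail, so an in-order insertion steps over no nodes while an out-of-order one walks backwards, and it assumes a plain two-pass pairing heap, where insertion is a single merge regardless of arrival order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint64_t Lsn;			/* hypothetical stand-in for XLogRecPtr */

typedef struct ListNode { Lsn lsn; struct ListNode *prev, *next; } ListNode;
typedef struct { ListNode *head, *tail; } SortedList;

/* Insert in ascending order, searching from the tail; returns nodes stepped over. */
static size_t
sorted_list_insert(SortedList *list, ListNode *node, Lsn lsn)
{
	size_t		steps = 0;
	ListNode   *pos = list->tail;

	node->lsn = lsn;
	while (pos != NULL && pos->lsn > lsn)
	{
		pos = pos->prev;
		steps++;
	}
	/* Link node right after pos; pos == NULL means node becomes the head. */
	node->prev = pos;
	node->next = pos ? pos->next : list->head;
	if (node->next) node->next->prev = node; else list->tail = node;
	if (pos) pos->next = node; else list->head = node;
	return steps;
}

typedef struct HeapNode { Lsn lsn; struct HeapNode *first_child, *next_sibling; } HeapNode;

/* Merge two heap roots: the larger root becomes the first child of the smaller. */
static HeapNode *
heap_merge(HeapNode *a, HeapNode *b)
{
	if (a == NULL) return b;
	if (b == NULL) return a;
	if (b->lsn < a->lsn) { HeapNode *t = a; a = b; b = t; }
	b->next_sibling = a->first_child;
	a->first_child = b;
	return a;
}

/* O(1) insertion: a single merge, whatever the arrival order. */
static HeapNode *
heap_insert(HeapNode *root, HeapNode *node, Lsn lsn)
{
	node->lsn = lsn;
	node->first_child = node->next_sibling = NULL;
	return heap_merge(root, node);
}

/* Remove the minimum (the root) by two-pass pairing of its children. */
static HeapNode *
heap_remove_first(HeapNode *root)
{
	HeapNode   *pairs = NULL, *result = NULL, *child = root->first_child;

	while (child != NULL)		/* pass 1: merge children pairwise */
	{
		HeapNode   *a = child, *b = child->next_sibling;

		child = b ? b->next_sibling : NULL;
		a->next_sibling = NULL;
		if (b) b->next_sibling = NULL;
		a = heap_merge(a, b);
		a->next_sibling = pairs;
		pairs = a;
	}
	while (pairs != NULL)		/* pass 2: fold the pairs into one heap */
	{
		HeapNode   *next = pairs->next_sibling;

		pairs->next_sibling = NULL;
		result = heap_merge(result, pairs);
		pairs = next;
	}
	return result;
}

/* Total nodes stepped over when inserting lsns[0..n-1] into an empty list. */
static size_t
total_list_steps(const Lsn *lsns, size_t n)
{
	SortedList	list = {NULL, NULL};
	ListNode   *nodes = calloc(n, sizeof(ListNode));
	size_t		total = 0;

	assert(nodes != NULL);
	for (size_t i = 0; i < n; i++)
		total += sorted_list_insert(&list, &nodes[i], lsns[i]);
	free(nodes);
	return total;
}

/* Insert lsns[0..n-1] into a heap, then pop them all into out[] in ascending order. */
static void
heap_drain_sorted(const Lsn *lsns, size_t n, Lsn *out)
{
	HeapNode   *nodes = calloc(n, sizeof(HeapNode));
	HeapNode   *root = NULL;

	assert(nodes != NULL);
	for (size_t i = 0; i < n; i++)
		root = heap_insert(root, &nodes[i], lsns[i]);
	for (size_t i = 0; i < n; i++)
	{
		out[i] = root->lsn;
		root = heap_remove_first(root);
	}
	free(nodes);
}
```

With these helpers, total_list_steps() on the ascending input {100, 200, 300, 400} steps over no nodes, while the fully reversed input steps over 0 + 1 + 2 + 3 = 6, which is the O(n)-per-insert degradation described above. heap_drain_sorted() returns the LSNs in ascending order whatever the insertion order, with the ordering work deferred to removal.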
Best,
Xuneng
Attachment | Content-Type | Size |
---|---|---|
v9-0003-Improve-read_local_xlog_page_guts-by-replacing-po.patch | application/octet-stream | 9.0 KB |
v9-0001-Add-pairingheap_initialize-for-shared-memory-usag.patch | application/octet-stream | 3.0 KB |
v9-0002-Add-infrastructure-for-efficient-LSN-waiting.patch | application/octet-stream | 24.7 KB |