From: | Xuneng Zhou <xunengzhou(at)gmail(dot)com> |
---|---|
To: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #19006: Assert(BufferIsPinned) in BufferGetBlockNumber() is triggered for forwarded buffer |
Date: | 2025-08-22 03:25:47 |
Message-ID: | CABPTF7U=1G_3fGxDcS_YceWu7zRGi+w-zVNzc9H7ux00cErEVA@mail.gmail.com |
Lists: | pgsql-bugs |
Hi,
The mailing list seems to prefer compressed attachments over many
scattered ones.
On Fri, Aug 22, 2025 at 11:08 AM Xuneng Zhou <xunengzhou(at)gmail(dot)com> wrote:
>
> Hi,
>
> XXX does the change have any measurable performance impact?
> I ran some preliminary performance tests on this patch and on HEAD. I'm
> not sure the test methodology is right, but I'm posting some findings
> here anyway:
>
> 1) Query runtimes (EXPLAIN (ANALYZE, BUFFERS), ms)
>
> Bitmap heap scan
> • HEAD (5 reps): [94.121, 89.600, 92.732, 78.383, 78.068]
> mean 86.58 ms, median 89.60, min 78.07, max 94.12. Buffers: shared hit only.
> • PATCHED (5 reps): [89.130, 80.375, 88.227, 98.661, 88.113]
> mean 88.90 ms, median 88.23, min 80.38, max 98.66. Buffers: shared hit only.
>
> Delta (PATCHED vs HEAD): mean +2.68% (slower), median −1.53% (faster).
>
> Sequential scan
> • HEAD (5 reps): [2718.212, 2691.739, 3253.613, 3060.852, 2963.142]
> mean 2937.51 ms, median 2963.14.
> • PATCHED (5 reps): [2719.971, 2770.082, 2697.245, 2610.932, 2662.142]
> mean 2692.07 ms, median 2697.25. Plans and buffer counts are in the
> attached file (example snippets show Buffers: shared hit ~600–900,
> read ~343–344k; I/O Timings ~10–16 ms).
>
> Delta (PATCHED vs HEAD): mean −8.36%, median −8.97%.
>
>
> 2) CPU micro-latencies (bpftrace) — hot read-stream functions
>
> Functions: StartReadBuffers (SRB), read_stream_start_pending_read
> (start_pending), read_stream_next_buffer (next_buffer). Averages are
> in µs; call counts are totals over the ~60 s run.
>
> Bitmap
> • HEAD — calls: SRB 2,697,243, start_pending 2,696,591, next_buffer 2,697,105;
> avg_us: SRB ≈4, start_pending ≈5, next_buffer ≈10.
> • PATCHED — calls: SRB 2,677,563, start_pending 2,676,940, next_buffer
> 2,677,477;
> avg_us: SRB ≈4, start_pending ≈5, next_buffer ≈10.
>
> Seq (control)
> • HEAD — calls: SRB 712,927, start_pending 2,042,992, next_buffer 5,528,071;
> avg_us: SRB ≈8, start_pending ≈3, next_buffer ≈6.
> • PATCHED — calls: SRB 727,083, start_pending 2,083,866, next_buffer 5,636,554;
> avg_us: SRB ≈8, start_pending ≈3, next_buffer ≈6.
>
> Read: Per-call costs are indistinguishable. Small call-count drift is
> noise-level and doesn’t suggest a behavioral change.
>
>
> 3) perf stat (-d) — counters over ~60 s (single backend)
>
> Key: task-clock (ms), CPU util (=task-clock/elapsed), cycles,
> instructions, IPC, branch-miss rate, L1D miss rate.
>
> Bitmap
> • HEAD → PATCHED (Δ)
> • task-clock: 8528.28 → 8542.37 ms (+0.17%)
> • CPU util: 0.1421 → 0.1424 (+0.16%)
> • cycles: 16.844 G → 16.796 G (−0.28%)
> • instructions: 51.161 G → 51.435 G (+0.54%)
> • IPC: 3.037 → 3.062 (+0.82%)
> • branch-miss rate: 0.125% → 0.115% (−7.6%)
> • L1D miss rate: 0.379% → 0.383% (+1.1%)
>
> Read: CPU-side efficiency is neutral-to-slightly better with the patch
> (↑IPC, ↓cycles, ↓branch-miss), matching the bpftrace picture.
>
> Seq (control)
>
> • HEAD → PATCHED (Δ)
> • task-clock: 10,862.53 → 8,896.26 ms (−18.1%)
> • CPU util: 0.181 → 0.148 (−18.1%)
> • cycles: 16.275 G → 13.049 G (−19.8%)
> • instructions: 39.782 G → 33.343 G (−16.2%)
> • IPC: 2.444 → 2.555 (+4.5%)
> • branch-miss rate: 0.380% → 0.350% (−8.0%)
> • L1D miss rate: 1.009% → 0.997% (−1.2%)
>
> Read: Control path remains healthy: IPC improves; miss rates improve slightly.
>
>
> Takeaways
> • The new *npinned interface and the removal of “forwarded buffer”
> bookkeeping appear neutral-to-positive under warm-cache conditions.
> • Bitmap workload: unchanged micro-latencies, modest CPU-efficiency
> uplift (↑IPC ~0.8%, ↓cycles), and runtime deltas that straddle zero
> (mean +2.68%, median −1.53%) with identical plans/buffer behavior.
> • Seq control: unchanged micro-latencies; perf shows higher IPC and
> slightly better miss rates; EXPLAIN shows a modest runtime improvement
> (~9% median).
>
> Best,
> Xuneng
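
The percentages quoted above can be re-derived from the raw numbers in the message. Below is a minimal sketch in Python that does so, assuming the delta is computed as (PATCHED − HEAD) / HEAD and IPC as instructions / cycles. The helper names (pct_delta, summarize, ipc) are made up for illustration and are not taken from the attached test scripts.

```python
from statistics import mean, median

# EXPLAIN ANALYZE runtimes (ms), copied from section 1 of the quoted message.
RUNS = {
    "bitmap": {
        "head":    [94.121, 89.600, 92.732, 78.383, 78.068],
        "patched": [89.130, 80.375, 88.227, 98.661, 88.113],
    },
    "seq": {
        "head":    [2718.212, 2691.739, 3253.613, 3060.852, 2963.142],
        "patched": [2719.971, 2770.082, 2697.245, 2610.932, 2662.142],
    },
}


def pct_delta(old, new):
    """Relative change of `new` versus `old`, in percent (negative = faster)."""
    return (new - old) / old * 100.0


def summarize(head, patched):
    """Mean and median deltas of PATCHED relative to HEAD, in percent."""
    return {
        "mean_delta_pct": pct_delta(mean(head), mean(patched)),
        "median_delta_pct": pct_delta(median(head), median(patched)),
    }


def ipc(instructions, cycles):
    """Instructions per cycle, as reported by perf stat."""
    return instructions / cycles


if __name__ == "__main__":
    for name, r in RUNS.items():
        s = summarize(r["head"], r["patched"])
        print(f"{name}: mean {s['mean_delta_pct']:+.2f}%, "
              f"median {s['median_delta_pct']:+.2f}%")

    # Bitmap counters from section 3 (in units of 1e9).
    print(f"bitmap IPC: HEAD {ipc(51.161, 16.844):.3f}, "
          f"PATCHED {ipc(51.435, 16.796):.3f}")
```

With only five repetitions per configuration and a spread of roughly 16 to 20 ms in the bitmap timings, these deltas are best read as rough indicators rather than statistically significant differences.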
Attachment | Content-Type | Size |
---|---|---|
perftests.zip | application/zip | 169.4 KB |