pgstattuple: Use streaming read API in pgstatindex functions

From: Xuneng Zhou <xunengzhou(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: pgstattuple: Use streaming read API in pgstatindex functions
Date: 2025-10-13 02:07:07
Message-ID: CABPTF7UeN2o-trr9r7K76rZExnO2M4SLfvTfbUY2CwQjCekgnQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi hackers,

While reading the code related to streaming reads and their current
use cases, I noticed that pgstatindex could potentially benefit from
adopting the streaming read API. The required change is relatively
simple—similar to what has already been implemented in the pg_warm and
pg_visibility extensions. I also ran some performance tests on an
experimental patch to validate the improvement.

Summary
Cold cache performance (the typical use case for diagnostic tools):
- Medium indexes (~21MB): 1.21x - 1.79x faster (20-44% speedup)
- Large indexes (~214MB): 1.50x - 1.90x faster (30-47% speedup)
- Xlarge indexes (~1351MB):1.4x–1.9x gains. (29–47% speedup)

Hardware: AX162-R from hetzner

Test matrix:
- Index types: Primary key, timestamp, float, composite (3 columns)
- Sizes: Medium (1M rows, ~21MB), Large (10M rows, ~214MB), XLarge
(50M rows, ~ 1351MB))
- Layouts: Unfragmented (sequential) and Fragmented (random insert order)
- Cache states: Cold (dropped OS cache) and Warm (pg_prewarm)

Xlarge fragmented example:
==> Creating secondary indexes on test_xlarge
Created 3 secondary indexes: created_at, score, composite
Created test_xlarge_pkey: 1351 MB
Fragmentation stats (random insert order):
leaf_frag_pct | avg_density_pct | leaf_pages | size
---------------+-----------------+------------+---------
49.9 | 71.5 | 172272 | 1351 MB
(1 row)

configuration:
- shared_buffers = 16GB
- effective_io_concurrency = 500
- io_combine_limit = 16
- autovacuum = off
- checkpoint_timeout = 1h
- bgwriter_delay = 10000ms (minimize background writes)
- jit = off
- max_parallel_workers_per_gather = 0

Unfragmented Indexes (Cold Cache)

Index Type    Size  Baseline  Patched  Speedup
Primary Key    Medium  31.5 ms  19.6 ms  1.58×
Primary Key    Large  184.0 ms  119.0 ms  1.54×
Timestamp     Medium  13.4 ms  10.5 ms  1.28×
Timestamp     Large  85.0 ms  45.6 ms  1.86×
Float (score)   Medium  13.7 ms  11.4 ms  1.21×
Float (score)   Large  84.0 ms  45.0 ms  1.86×
Composite (3 col) Medium  26.7 ms  17.2 ms  1.56×
Composite (3 col) Large  89.8 ms  51.2 ms  1.75×

Fragmented Indexes (Cold Cache)

To address concerns about filesystem fragmentation, I tested indexes built
with random inserts (ORDER BY random()) to trigger page splits and create
fragmented indexes:

Index Type    Size  Baseline  Patched  Speedup
Primary Key    Medium  41.9 ms  23.5 ms  1.79×
Primary Key    Large  236.0 ms  148.0 ms  1.58×
Primary Key    XLarge  953.4 ms  663.1 ms  1.43×
Timestamp     Medium  32.1 ms  18.8 ms  1.70×
Timestamp     Large  188.0 ms  117.0 ms  1.59×
Timestamp     XLarge  493.0 ms  518.6 ms  0.95×
Float (score)   Medium  14.0 ms  10.9 ms  1.28×
Float (score)   Large  85.8 ms  45.2 ms  1.89×
Float (score)   XLarge  263.2 ms  176.5 ms  1.49×
Composite (3 col) Medium  42.3 ms  24.1 ms  1.75×
Composite (3 col) Large  245.0 ms  162.0 ms  1.51×
Composite (3 col) XLarge  1052.5 ms  716.5 ms  1.46×

Summary: Fragmentation generally does not hurt streaming reads; most
fragmented cases still see 1.4×–1.9× gains. One outlier (XLarge
Timestamp) shows a slight regression (0.95×).

Warm Cache Results
When indexes are fully cached in shared_buffers:
Unfragmented: infrequent little regression for small to medium size
index(single digit ms variance, barely noticeable); small gains for
large size index
Fragmented: infrequent little regression for small to medium size
index(single digit ms variance, barely noticeable); small gains for
large size index

Best,
Xuneng

Attachment Content-Type Size
0001-pgstattuple-Use-streaming-read-API-in-pgstatindex-fu.patch application/x-patch 7.5 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Xuneng Zhou 2025-10-13 03:20:19 Re: pgstattuple: Use streaming read API in pgstatindex functions
Previous Message Tom Lane 2025-10-13 01:54:11 Re: Failure building libpq v18.0 on old aarch64