From: | Xuneng Zhou <xunengzhou(at)gmail(dot)com> |
---|---|
To: | pgsql-hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Cc: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
Subject: | Re: pgstattuple: Use streaming read API in pgstatindex functions |
Date: | 2025-10-13 03:20:19 |
Message-ID: | CABPTF7V8kVJcEsYKA94h4xBQAvY6shKwW98c=Wa2CuRnNYU8ow@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
On Mon, Oct 13, 2025 at 10:07 AM Xuneng Zhou <xunengzhou(at)gmail(dot)com> wrote:
>
> Hi hackers,
>
> While reading the code related to streaming reads and their current
> use cases, I noticed that pgstatindex could potentially benefit from
> adopting the streaming read API. The required change is relatively
> simple—similar to what has already been implemented in the pg_warm and
> pg_visibility extensions. I also ran some performance tests on an
> experimental patch to validate the improvement.
>
> Summary
> Cold cache performance (the typical use case for diagnostic tools):
> - Medium indexes (~21MB): 1.21x - 1.79x faster (20-44% speedup)
> - Large indexes (~214MB): 1.50x - 1.90x faster (30-47% speedup)
> - Xlarge indexes (~1351MB):1.4x–1.9x gains. (29–47% speedup)
>
> Hardware: AX162-R from hetzner
>
> Test matrix:
> - Index types: Primary key, timestamp, float, composite (3 columns)
> - Sizes: Medium (1M rows, ~21MB), Large (10M rows, ~214MB), XLarge
> (50M rows, ~ 1351MB))
> - Layouts: Unfragmented (sequential) and Fragmented (random insert order)
> - Cache states: Cold (dropped OS cache) and Warm (pg_prewarm)
>
> Xlarge fragmented example:
> ==> Creating secondary indexes on test_xlarge
> Created 3 secondary indexes: created_at, score, composite
> Created test_xlarge_pkey: 1351 MB
> Fragmentation stats (random insert order):
> leaf_frag_pct | avg_density_pct | leaf_pages | size
> ---------------+-----------------+------------+---------
> 49.9 | 71.5 | 172272 | 1351 MB
> (1 row)
>
> configuration:
> - shared_buffers = 16GB
> - effective_io_concurrency = 500
> - io_combine_limit = 16
> - autovacuum = off
> - checkpoint_timeout = 1h
> - bgwriter_delay = 10000ms (minimize background writes)
> - jit = off
> - max_parallel_workers_per_gather = 0
>
> Unfragmented Indexes (Cold Cache)
>
> Index Type Size Baseline Patched Speedup
> Primary Key Medium 31.5 ms 19.6 ms 1.58×
> Primary Key Large 184.0 ms 119.0 ms 1.54×
> Timestamp Medium 13.4 ms 10.5 ms 1.28×
> Timestamp Large 85.0 ms 45.6 ms 1.86×
> Float (score) Medium 13.7 ms 11.4 ms 1.21×
> Float (score) Large 84.0 ms 45.0 ms 1.86×
> Composite (3 col) Medium 26.7 ms 17.2 ms 1.56×
> Composite (3 col) Large 89.8 ms 51.2 ms 1.75×
>
> ⸻
>
> Fragmented Indexes (Cold Cache)
>
> To address concerns about filesystem fragmentation, I tested indexes built
> with random inserts (ORDER BY random()) to trigger page splits and create
> fragmented indexes:
>
> Index Type Size Baseline Patched Speedup
> Primary Key Medium 41.9 ms 23.5 ms 1.79×
> Primary Key Large 236.0 ms 148.0 ms 1.58×
> Primary Key XLarge 953.4 ms 663.1 ms 1.43×
> Timestamp Medium 32.1 ms 18.8 ms 1.70×
> Timestamp Large 188.0 ms 117.0 ms 1.59×
> Timestamp XLarge 493.0 ms 518.6 ms 0.95×
> Float (score) Medium 14.0 ms 10.9 ms 1.28×
> Float (score) Large 85.8 ms 45.2 ms 1.89×
> Float (score) XLarge 263.2 ms 176.5 ms 1.49×
> Composite (3 col) Medium 42.3 ms 24.1 ms 1.75×
> Composite (3 col) Large 245.0 ms 162.0 ms 1.51×
> Composite (3 col) XLarge 1052.5 ms 716.5 ms 1.46×
>
> Summary: Fragmentation generally does not hurt streaming reads; most
> fragmented cases still see 1.4×–1.9× gains. One outlier (XLarge
> Timestamp) shows a slight regression (0.95×).
>
> ⸻
>
> Warm Cache Results
> When indexes are fully cached in shared_buffers:
> Unfragmented: infrequent little regression for small to medium size
> index(single digit ms variance, barely noticeable); small gains for
> large size index
> Fragmented: infrequent little regression for small to medium size
> index(single digit ms variance, barely noticeable); small gains for
> large size index
>
Fix indentation issue in v1.
Best,
Xuneng
Attachment | Content-Type | Size |
---|---|---|
v2-0001-pgstattuple-Use-streaming-read-API-in-pgstatindex.patch | application/octet-stream | 7.4 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Chao Li | 2025-10-13 03:33:06 | Re: [PING] [PATCH v2] parallel pg_restore: avoid disk seeks when jumping short distance forward |
Previous Message | Xuneng Zhou | 2025-10-13 02:07:07 | pgstattuple: Use streaming read API in pgstatindex functions |