| From: | Xuneng Zhou <xunengzhou(at)gmail(dot)com> |
|---|---|
| To: | Michael Paquier <michael(at)paquier(dot)xyz> |
| Cc: | Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com> |
| Subject: | Re: Streamify more code paths |
| Date: | 2026-03-15 03:47:06 |
| Message-ID: | CABPTF7WONbvOOJPKwpPmh0Pu6scFZbCwxKq5ThdPiWEfdMjfjA@mail.gmail.com |
| Lists: | pgsql-hackers |
On Sun, Mar 15, 2026 at 10:51 AM Xuneng Zhou <xunengzhou(at)gmail(dot)com> wrote:
>
> On Sat, Mar 14, 2026 at 5:56 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> >
> > On Fri, Mar 13, 2026 at 10:39:52AM +0800, Xuneng Zhou wrote:
> > > Thanks for fixing this and for taking the time to review and test
> > > the patches.
> >
> > Looking at the rest, I have produced some numbers:
> > pgstattuple_small (20k tuples, io_uring) base= 60839.9ms patch= 10949.9ms 5.56x ( 82.0%) (reads=4139->260, io_time=49616.97->55.25ms)
> > pgstattuple_small (20k tuples, worker=3) base= 60577.5ms patch= 11470.0ms 5.28x ( 81.1%) (reads=4139->260, io_time=49359.79->69.60ms)
> > hash_vacuum (1M tuples, io_uring) base=199929.0ms patch=161747.0ms 1.24x ( 19.1%) (reads=4665->1615, io_time=47084.8->9925.77ms)
> > hash_vacuum (1M tuples, worker=12) base=203417.0ms patch=161687.0ms 1.26x ( 20.5%) (reads=4665->1615, io_time=48356.3->9917.24ms)
> >
> > The hash vacuum numbers are less impressive here than yours. Trying
> > out various configurations does not change the results much (I was
> > puzzled for a couple of hours that I saw no performance impact, but
> > I had forgotten to evict the index pages from shared buffers, which
> > brings the numbers to what I have here), but I'll take it anyway.
>
> My guess is that the results are influenced by the write delay. Vacuum
> operations can be write-intensive, so when both the read and write
> delays are set to 2-5 ms, a large portion of the runtime may be spent
> on writes. According to Amdahl's Law, the overall speedup from
> optimizing a single component is limited by the fraction of the total
> execution time that the component actually contributes. In this case,
> the potential speedup from streaming the read path can be masked by
> the time spent performing writes.
>
> To investigate this, I added a new option, write-delay. When it is set
> to zero, the benchmark simulates a system with a fast write device and
> a slow read device, reducing the proportion of time spent on writes.
> Admittedly, this setup is somewhat artificial—we would not normally
> expect such a large discrepancy between read and write performance in
> real systems.
>
> -- worker 12, write-delay 2 ms
> hash_vacuum_medium base= 33743.2ms patch= 27371.3ms 1.23x ( 18.9%) (reads=4662→1612, read_time=8242.51→1725.03ms, writes=12689→12651, write_time=25144.87→25041.75ms)
>
> -- worker 12, write-delay 0 ms
> hash_vacuum_medium base= 8601.1ms patch= 2234.0ms 3.85x ( 74.0%) (reads=4662→1612, read_time=8021.65→1637.87ms, writes=12689→12651, write_time=337.38→288.15ms)
>
> To better understand the behavior, the latest version of the script
> separates the I/O time into read time and write time. This allows us
> to directly observe their respective contributions and how they change
> across runs. A further improvement would be to report the speedup for
> the read and write components separately, making it easier to
> understand where and how much the performance gains actually occur.
The updated script now reports speedup separately for the read and
write paths like this:
hash_vacuum_medium base= 33747.2ms patch= 27379.7ms 1.23x ( 18.9%) read: 4662→1612 ops, 8238.72→1725.86ms (4.77x); write: 12689→12651 ops, 25146.51→25053.57ms (1.00x)
I think it is useful to keep the write-delay option even with this
reporting. Separating the read and write delays also helps reduce the
overall runtime of the tests, especially for large data sizes: we only
slow down the read path while keeping the write path fast.
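As a sanity check, the per-path numbers above line up with the Amdahl's Law
reasoning earlier in the thread. A minimal Python sketch (with the
hash_vacuum_medium figures from the output above hard-coded; nothing is
measured here) predicting the overall speedup from the read fraction and
the read-path speedup:

```python
# Amdahl's Law check against the reported hash_vacuum_medium numbers.
base_total, patch_total = 33747.2, 27379.7  # total runtime, ms
base_read, patch_read = 8238.72, 1725.86    # time spent in reads, ms

read_fraction = base_read / base_total      # share of runtime that is read I/O
read_speedup = base_read / patch_read       # per-path speedup (~4.77x)

# Amdahl's Law: overall speedup when only the read fraction is accelerated.
predicted = 1 / ((1 - read_fraction) + read_fraction / read_speedup)
actual = base_total / patch_total

print(f"predicted {predicted:.2f}x, actual {actual:.2f}x")
# → predicted 1.24x, actual 1.23x
```

With reads only ~24% of the runtime, even a 4.77x read speedup can buy at
most about 1.24x overall, which matches the measured 1.23x.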
--
Best,
Xuneng
| Attachment | Content-Type | Size |
|---|---|---|
| run_streaming_benchmark.sh | text/x-sh | 35.4 KB |