| From: | Riaan Stander <rstander(at)exa(dot)co(dot)za> |
|---|---|
| To: | pgsql-performance(at)lists(dot)postgresql(dot)org |
| Subject: | Re: Postgres IO sweet spot |
| Date: | 2026-02-16 14:57:14 |
| Message-ID: | 06b1a8a9-6540-4938-b00b-7407f63e5a22@exa.co.za |
| Lists: | pgsql-performance |
Good day all
Just following up to see if there is any advice from the community. My original
post was very long, but I just wanted to paint the picture.
In summary, I just want to find out if anybody has concrete advice
on storage devices that are acceptable for use with Postgres,
especially regarding latency. I've highlighted some tests I've done, but I need
to interpret the numbers correctly.
Any feedback on the following would help:
* Acceptable write IO latency
o WAL
o Data
o Temp
o ...
* Acceptable read IO latency
* Any other storage/drive related advice
Regards
Riaan
On 2026/02/11 01:13, Riaan Stander wrote:
> Good day
>
> We host our own Postgres (v17) server on-prem as the backbone of our
> SaaS application. It's a fairly busy OLTP application with a
> database-per-tenant strategy, which obviously complicates our setup.
> Our hosting platform is as follows:
> 3 x host servers running Microsoft Storage Spaces in a three-way mirror
> Ubuntu VM hosting Postgres
>
> A few months ago we had some severe performance issues, with lots of
> queries and write operations just pending. After some deep
> investigation we realized that disk IO was causing the
> issue. With iostat we could see the write await was above 30ms,
> sometimes spiking much higher. This was resolved by moving our
> backups (made with Veeam) from the primary to a replica on
> other infrastructure. Our current happy state, where clients are not
> experiencing issues, is an iostat write await of 5ms and lower.
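To track that write-await figure continuously, the column can be picked out of `iostat -x` output by name, which survives sysstat versions that reorder columns. The two-line sample here is fabricated for illustration (the device name `sdb` and all figures are assumptions); on a live box you would pipe `iostat -dx 5` into the same awk filter:

```shell
#!/bin/sh
# Locate the w_await column by its header name rather than a fixed field
# number, since sysstat versions lay out `iostat -x` columns differently.
# Sample input is fabricated; replace with:  iostat -dx 5 | awk '...'
sample='Device r/s rkB/s w/s wkB/s r_await w_await %util
sdb 120.0 960.0 340.0 2720.0 0.8 4.7 38.0'

printf '%s\n' "$sample" | awk '
  $1 == "Device" { for (i = 1; i <= NF; i++) if ($i == "w_await") col = i }
  col && $1 == "sdb" { print $1, "w_await:", $col "ms" }'
```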
>
> All was good for a few months until recently, when the issue started
> again. This time it could not be the backups. We had various hardware
> vendors involved, but at some point it came to light that the Storage
> Spaces hardware is all mechanical disks, with NVMe used only for
> Storage Spaces journaling and caching. There are now some discussions
> about upgrading the drives to SSD, but my concern is that this is not
> guaranteed to solve the issue. Especially with the three-way mirror it
> seems all writes must be acknowledged by the other hosts before
> returning, so latency is almost impossible to remove.
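The mirror concern can be put in rough numbers: a synchronous three-way mirror cannot acknowledge a write faster than the slowest leg's network RTT plus device commit time. A back-of-envelope sketch, where every figure (RTT, NVMe and HDD commit latencies) is an assumed illustrative value, not a measurement from this thread:

```shell
#!/bin/sh
# Rough floor for an acknowledged write in a synchronous 3-way mirror:
# slowest of (inter-host RTT + device commit). All figures are assumed.
awk 'BEGIN {
  rtt_ms  = 0.2    # assumed inter-host network round trip
  nvme_ms = 0.05   # assumed SSD/NVMe commit latency
  hdd_ms  = 8.0    # assumed mechanical-disk commit latency
  printf "all-SSD mirror floor:  ~%.2f ms\n", rtt_ms + nvme_ms
  printf "HDD-backed mirror:     ~%.2f ms\n", rtt_ms + hdd_ms
}'
```

Under these assumptions the mirror's network hop is a small fraction of a millisecond, so the mechanical disks, not the mirroring itself, would dominate the write latency.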
>
> So now my question. I have started running some IO tests using fio,
> pg_test_fsync & pg_test_timing. Before we spend days/months trying to
> tune Postgres settings, I'm trying to get some definitive published
> information about what IO numbers I should expect when running plain
> hardware tests with Postgres completely out of the loop. I've seen
> some information that 1ms or less write latency is what you want for
> WAL. My logic says that if you have a stiffie (floppy) drive for
> storage you can tune it, but you still have a stiffie drive.
>
> These are the tests I've run so far:
> 1. WAL-Style Latency Test (4K random sync writes)
> fio --name=wal-latency --filename=$TESTDIR/fio_wal_test --size=2G
> --rw=randwrite --bs=4k --iodepth=1 --ioengine=libaio --direct=1
> --fsync=1 --runtime=60 --group_reporting
>
> 2. Random Read IOPS Test (index lookup simulation)
> fio --name=index-read --filename=$TESTDIR/fio_index_test --size=8G
> --rw=randread --bs=4k --iodepth=32 --ioengine=libaio --direct=1
> --runtime=60 --group_reporting
>
> 3. Mixed OLTP Test (70% read / 30% write)
> fio --name=oltp-mixed --filename=$TESTDIR/fio_oltp_mixed --size=8G
> --rw=randrw --rwmixread=70 --bs=8k --iodepth=32 --ioengine=libaio
> --direct=1 --runtime=60 --group_reporting
>
> 4. Checkpoint Burst Test (sequential write pressure)
> fio --name=checkpoint-burst --filename=$TESTDIR/fio_checkpoint
> --size=20G --rw=write --bs=1M --iodepth=64 --ioengine=libaio
> --direct=1 --runtime=60 --group_reporting
>
> 5. PostgreSQL fsync Code Path Test
> pg_test_fsync -f $TESTDIR/pg_test_fsync
>
> 6. Timer / Scheduling Jitter Test
> pg_test_timing -d 3
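For test 1 above (iodepth=1, fsync after every write), the measured sync latency directly caps the commit rate of a single session, since each synchronous commit waits for one WAL fsync: roughly 1000 / latency_ms commits per second. A quick sanity check, where the 5.0 ms figure is an assumed example rather than a number reported in this thread:

```shell
#!/bin/sh
# A single session doing synchronous commits cannot commit faster than
# one fsync per commit allows: max commits/sec ~= 1000 / fsync_ms.
# 5.0 ms is an assumed example latency, not a measured value.
lat_ms=5.0
awk -v lat="$lat_ms" \
  'BEGIN { printf "ceiling: ~%.0f commits/sec per session\n", 1000 / lat }'
```

So a 5 ms fsync caps one session at roughly 200 commits/sec, while the often-cited sub-1 ms WAL target allows 1000 or more; that is one way to relate the fio numbers back to acceptable latency.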
>
> Regards
> Riaan