WAL prefetch

From: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Sean Chittenden <seanc(at)joyent(dot)com>
Subject: WAL prefetch
Date: 2018-06-13 13:09:45
Message-ID: 49df9cd2-7086-02d0-3f8d-535a32d44c82@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

There was very interesting presentation at pgconf about pg_prefaulter:

http://www.pgcon.org/2018/schedule/events/1204.en.html

But it is implemented in GO and using pg_waldump.
I tried to do the same but using built-on Postgres WAL traverse functions.
I have implemented it as extension for simplicity of integration.
In principle it can be started as BG worker.

First of all I tried to estimate effect of preloading data.
I have implemented prefetch utility with is also attached to this mail.
It performs random reads of blocks of some large file and spawns some
number of prefetch threads:

Just normal read without prefetch:
./prefetch -n 0 SOME_BIG_FILE

One prefetch thread which uses pread:
./prefetch SOME_BIG_FILE

One prefetch thread which uses posix_fadvise:
./prefetch -f SOME_BIG_FILE

4 prefetch thread which uses posix_fadvise:
./prefetch -f -n 4 SOME_BIG_FILE

Based on this experiments (on my desktop), I made the following conclusions:

1. Prefetch at HDD doesn't give any positive effect.
2. Using posix_fadvise allows to speed-up random read speed at SSD up to
2 times.
3. posix_fadvise(WILLNEED) is more efficient than performing normal reads.
4. Calling posix_fadvise in more than one thread has no sense.

I have tested wal_prefetch at two powerful servers with 24 cores, 3Tb
NVME RAID 10 storage device and 256Gb of RAM connected using InfiniBand.
The speed of synchronous replication between two nodes is increased from
56k TPS to 60k TPS (on pgbench with scale 1000).

Usage:
1. At master: create extension wal_prefetch
2. At replica: Call pg_wal_prefetch() function: it will not return until
you interrupt it.

pg_wal_prefetch function will infinitely traverse WAL and prefetch block
references in WAL records
using posix_fadvise(WILLNEED) system call.

It is possible to explicitly specify start LSN for pg_wal_prefetch()
function. Otherwise, WAL redo position will be used as start LSN.

--

Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
prefetch.c text/x-csrc 2.3 KB
wal_prefetch.tgz application/x-compressed-tar 6.8 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2018-06-13 13:15:19 Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
Previous Message Tomas Vondra 2018-06-13 13:03:03 Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)