RE: [HACKERS][PATCH] Applying PMDK to WAL operations for persistent memory

From: "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>
To: 'Robert Haas' <robertmhaas(at)gmail(dot)com>
Cc: Yoshimi Ichiyanagi <ichiyanagi(dot)yoshimi(at)lab(dot)ntt(dot)co(dot)jp>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, "menjo(dot)takashi(at)lab(dot)ntt(dot)co(dot)jp" <menjo(dot)takashi(at)lab(dot)ntt(dot)co(dot)jp>, "ishizaki(dot)teruaki(at)lab(dot)ntt(dot)co(dot)jp" <ishizaki(dot)teruaki(at)lab(dot)ntt(dot)co(dot)jp>
Subject: RE: [HACKERS][PATCH] Applying PMDK to WAL operations for persistent memory
Date: 2018-01-25 03:31:13
Message-ID: 0A3221C70F24FB45833433255569204D1F8A4BF7@G01JPEXMBYT05
Lists: pgsql-hackers

From: Robert Haas [mailto:robertmhaas(at)gmail(dot)com]
> I think open_datasync will be worse on systems where fsync() is expensive
> -- it forces the data out to disk immediately, even if the data doesn't
> need to be flushed immediately. That's bad, because we wait immediately
> when we could have deferred the wait until later and maybe gotten the WAL
> writer to do the work in the background. But it might be better on systems
> where fsync() is basically free, because there you might as well just get
> it out of the way immediately and not leave something left to be done later.
>
> This is just a guess, of course. You didn't mention what the underlying
> storage for your test was?

Your guess was correct. My file system was ext3, where fsync() writes out all dirty buffers in the page cache.

As you predicted, open_datasync was about 20% faster than fdatasync for a single 8kB write on RHEL 7.2, on an LVM volume with ext4 (mounted with options noatime, nobarrier) on PCIe flash memory.

5 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.

Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync is Linux's default)
        open_datasync                     50829.597 ops/sec      20 usecs/op
        fdatasync                         42094.381 ops/sec      24 usecs/op
        fsync                             42209.972 ops/sec      24 usecs/op
        fsync_writethrough                              n/a
        open_sync                         48669.605 ops/sec      21 usecs/op

Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync is Linux's default)
        open_datasync                     26366.373 ops/sec      38 usecs/op
        fdatasync                         33922.725 ops/sec      29 usecs/op
        fsync                             32990.209 ops/sec      30 usecs/op
        fsync_writethrough                              n/a
        open_sync                         24326.249 ops/sec      41 usecs/op
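
To make the comparison explicit, here is a small Python sketch that computes the relative difference between open_datasync and fdatasync from the ops/sec figures above. The dictionary layout and the helper name relative_change are just for illustration; only the numbers come from the test output.

```python
# Throughput figures (ops/sec) copied from the pg_test_fsync output above.
results = {
    "one 8kB write": {"open_datasync": 50829.597, "fdatasync": 42094.381},
    "two 8kB writes": {"open_datasync": 26366.373, "fdatasync": 33922.725},
}


def relative_change(new, base):
    """Return the percentage change of `new` relative to `base`."""
    return (new - base) / base * 100.0


for case, r in results.items():
    pct = relative_change(r["open_datasync"], r["fdatasync"])
    print(f"{case}: open_datasync vs fdatasync = {pct:+.1f}%")
```

By this arithmetic open_datasync is roughly 21% faster than fdatasync for one 8kB write, but roughly 22% slower for two 8kB writes.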

What do you think about changing the default value of wal_sync_method on Linux in PG 11? I can understand the concern that users might hit performance degradation if they are using PostgreSQL on older systems. But it is also a waste (mottainai) that many users never notice the benefits of wal_sync_method = open_datasync on new systems.
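
For anyone who wants to see the two write patterns side by side outside of pg_test_fsync, here is a rough Python sketch, assuming Linux (it relies on os.O_DSYNC and os.fdatasync). It is not how pg_test_fsync is implemented: it does not use O_DIRECT, and the function names, file location and iteration count are my own choices. It only shows the difference in pattern: with O_DSYNC every write() waits for the data to become durable, while the fdatasync pattern issues all writes first and then waits once.

```python
import os
import tempfile
import time

BLOCK = b"\0" * 8192  # one 8kB block


def time_open_datasync(path, writes_per_op, ops):
    """Average seconds per op when every write() is synchronous (O_DSYNC)."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_DSYNC, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(ops):
            os.lseek(fd, 0, os.SEEK_SET)
            for _ in range(writes_per_op):
                os.write(fd, BLOCK)  # returns only after the data is durable
        return (time.perf_counter() - start) / ops
    finally:
        os.close(fd)


def time_fdatasync(path, writes_per_op, ops):
    """Average seconds per op when writes are buffered and flushed once."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(ops):
            os.lseek(fd, 0, os.SEEK_SET)
            for _ in range(writes_per_op):
                os.write(fd, BLOCK)  # goes to the page cache
            os.fdatasync(fd)         # a single wait for all writes of this op
        return (time.perf_counter() - start) / ops
    finally:
        os.close(fd)


if __name__ == "__main__":
    # The temporary directory is an arbitrary choice; results depend heavily
    # on the file system and storage underneath it.
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "sync_test.dat")
        for writes in (1, 2):
            a = time_open_datasync(path, writes, ops=100)
            b = time_fdatasync(path, writes, ops=100)
            print(f"{writes} x 8kB: O_DSYNC {a * 1e6:.0f} usecs/op, "
                  f"fdatasync {b * 1e6:.0f} usecs/op")
```

The absolute numbers from such a sketch will not match pg_test_fsync, so please treat it only as a way to check the trend on a particular file system.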

Regards
Takayuki Tsunakawa
