Re: [HACKERS][PATCH] Applying PMDK to WAL operations for persistent memory

From: Yoshimi Ichiyanagi <ichiyanagi(dot)yoshimi(at)lab(dot)ntt(dot)co(dot)jp>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, "menjo(dot)takashi(at)lab(dot)ntt(dot)co(dot)jp" <menjo(dot)takashi(at)lab(dot)ntt(dot)co(dot)jp>, "ishizaki(dot)teruaki(at)lab(dot)ntt(dot)co(dot)jp" <ishizaki(dot)teruaki(at)lab(dot)ntt(dot)co(dot)jp>
Subject: Re: [HACKERS][PATCH] Applying PMDK to WAL operations for persistent memory
Date: 2018-01-30 08:37:39
Message-ID: C5BD399A5962A7DAD59E3A1@lab.ntt.co.jp
Lists: pgsql-hackers

<CA+TgmoZygQO3EC4mMdf-b=UuY3HZz6+-Y2w5_s9bLtH4NPw6Bg(at)mail(dot)gmail(dot)com>
Fri, 19 Jan 2018 09:42:25 -0500, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
>I think that you really need to include the checkpoints in the tests.
>I would suggest setting max_wal_size and/or checkpoint_timeout so that
>you reliably complete 2 checkpoints in a 30-minute test, and then do a
>comparison on that basis.

Experimental setup:
-------------------------
Server: HP ProLiant DL360 Gen9
CPU: Xeon E5-2667 v4 (3.20GHz); 2 processors (without HT)
DRAM: DDR4-2400; 32 GiB/processor
(8GiB/socket x 4 sockets/processor) x 2 processors
NVDIMM: DDR4-2133; 32 GiB/processor
(node 0: 8GiB/socket x 2 sockets/processor,
node 1: 8GiB/socket x 6 sockets/processor)
HDD: Seagate Constellation2 2.5inch SATA 3.0. 6Gb/s 1TB 7200rpm x 1
SATA-SSD: Crucial_CT500MX200SSD1 (SATA 3.2, SATA 6Gb/s)
OS: Ubuntu 16.04, linux-4.12
DAX FS: ext4
PMDK: master@Aug 30, 2017
PostgreSQL: master
Note: I bound the postgres processes to one NUMA node,
and the benchmark processes to the other NUMA node.
-------------------------

postgresql.conf
-------------------------
# - Settings -
wal_level = replica
fsync = on
synchronous_commit = on
wal_sync_method = pmem_drain/fdatasync/open_datasync
full_page_writes = on
wal_compression = off

# - Checkpoints -
checkpoint_timeout = 12min
max_wal_size = 20GB
min_wal_size = 20GB
-------------------------

Executed commands:
--------------------------------------------------------------------
# numactl -N 1 pg_ctl start -D [PG_DIR] -l [LOG_FILE]
# numactl -N 0 pgbench -s 200 -i [DB_NAME]
# numactl -N 0 pgbench -c 32 -j 32 -T 1800 -r [DB_NAME] -M prepared
--------------------------------------------------------------------

The results:
--------------------------------------------------------------------
A) Applied the patches to PG src, and compiled PG with libpmem
B) Applied the patches to PG src, and compiled PG without libpmem
C) Original PG

The averages of running pgbench three times on *PMEM* are:
A)
wal_sync_method = pmem_drain tps = 41660.42524
wal_sync_method = open_datasync tps = 39913.49897
wal_sync_method = fdatasync tps = 39900.83396

C)
wal_sync_method = open_datasync tps = 40335.50178
wal_sync_method = fdatasync tps = 40649.57772

The averages of running pgbench three times on *SATA-SSD* are:
B)
wal_sync_method = open_datasync tps = 7224.07146
wal_sync_method = fdatasync tps = 7222.19177

C)
wal_sync_method = open_datasync tps = 7258.79093
wal_sync_method = fdatasync tps = 7263.19878
--------------------------------------------------------------------

The results above show that, on PMEM, wal_sync_method=pmem_drain was
about 4% faster than wal_sync_method=open_datasync/fdatasync in
configuration A (41660 tps versus roughly 39900 tps).
When pgbench ran on SATA-SSD, wal_sync_method=fdatasync was as fast
as wal_sync_method=open_datasync.
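
As a rough cross-check of the PMEM numbers above, the relative differences
can be computed directly from the reported averages. This is only an
illustrative sketch: the tps values are copied from the results, while the
dictionary layout and the helper name percent_gain are assumptions made
for this example and are not part of the patch or of pgbench.

```python
# Average tps on PMEM, copied from the results above.
# Keys are (build, wal_sync_method); the layout is hypothetical.
PMEM_TPS = {
    ("A", "pmem_drain"): 41660.42524,
    ("A", "open_datasync"): 39913.49897,
    ("A", "fdatasync"): 39900.83396,
    ("C", "open_datasync"): 40335.50178,
    ("C", "fdatasync"): 40649.57772,
}


def percent_gain(new_tps, base_tps):
    """Relative throughput gain of new_tps over base_tps, in percent."""
    return (new_tps / base_tps - 1.0) * 100.0


# Compare pmem_drain (build A) against every other PMEM measurement.
reference = PMEM_TPS[("A", "pmem_drain")]
gains = {
    key: percent_gain(reference, tps)
    for key, tps in PMEM_TPS.items()
    if key != ("A", "pmem_drain")
}

for (build, method), gain in sorted(gains.items()):
    print(f"pmem_drain vs {build}) {method}: {gain:+.2f}%")
```

The gain is a little over 4% against the other sync methods in build A,
and smaller against the unpatched build C.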

>> Do you know any good WAL I/O intensive benchmarks? DBT2?
>
>pgbench is quite a WAL-intensive benchmark; it is much more
>write-heavy than what most systems experience in real life, at least
>in my experience. Your comparison of DAX FS to DAX FS + PMDK is very
>interesting, but in real life the bandwidth of DAX FS is already so
>high -- and the latency so low -- that I think most real-world
>workloads won't gain very much. At least, that is my impression based
>on internal testing EnterpriseDB did a few months back. (Thanks to
>Mithun and Kuntal for that work.)

In the near future, many physical devices will send sensing data
(IoT devices might saturate tens of gigabits of network bandwidth),
and the amount of data inserted into the DB will increase significantly.
I think that PMEM will be needed for use cases like IoT.

<CA+TgmobDO4qj2nMLdm2Dv5VRT8cVQjv7kftsS_P-kNpNw=TRug(at)mail(dot)gmail(dot)com>
Thu, 25 Jan 2018 09:30:45 -0500, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>Well, some day persistent memory may be a common enough storage
>technology that such a change makes sense, but these days most people
>have either SSD or spinning disks, where the change would probably be
>a net negative. It seems more like something we might think about
>changing in PG 20 or PG 30.
>

Oracle and Microsoft SQL Server already support PMEM [1][2].
I think it is not too early for PostgreSQL to support PMEM as well.

[1] http://dbheartbeat.blogspot.jp/2017/11/doag-2017-oracle-18c-dbim-oracle.htm
[2] https://www.snia.org/sites/default/files/PM-Summit/2018/presentations/06_PM_Summit_2018_Talpey-Final_Post-CORRECTED.pdf

--
Yoshimi Ichiyanagi
