RE: [PoC] Non-volatile WAL buffer

From: "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>
To: 'Tomas Vondra' <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Takashi Menjo <takashi(dot)menjo(at)gmail(dot)com>, Takashi Menjo <takashi(dot)menjou(dot)vg(at)hco(dot)ntt(dot)co(dot)jp>, "Deng, Gang" <gang(dot)deng(at)intel(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: RE: [PoC] Non-volatile WAL buffer
Date: 2021-01-28 02:45:46
Message-ID: TYAPR01MB29903EB512A3A47E7099E8E8FEBA9@TYAPR01MB2990.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
> (c) As mentioned before, PMEM behaves differently with concurrent
> access, i.e. it reaches peak throughput with a relatively low number of
> threads writing data, and then the throughput drops quite quickly. I'm
> not sure if the same thing applies to pmem_drain() too - if it does, we
> may need something like we have for insertions, i.e. a handful of locks
> allowing a limited number of concurrent inserts.
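
To make the "handful of locks" idea in the quoted paragraph concrete, here is a minimal sketch of capping how many threads may flush at once. It is not PostgreSQL or libpmem code: `LimitedFlusher`, `MAX_CONCURRENT_FLUSHERS`, `simulated_drain`, and `run_demo` are hypothetical names, the limit of 4 is an arbitrary assumption, and a short sleep merely stands in for a pmem_drain()-like call.

```python
import threading
import time

# Hypothetical cap on concurrent flushers; the right value would have to be
# measured on real PMEM hardware.
MAX_CONCURRENT_FLUSHERS = 4


class LimitedFlusher:
    """Lets at most `limit` threads run a flush-like operation at once."""

    def __init__(self, limit):
        self._slots = threading.BoundedSemaphore(limit)
        self._counter_lock = threading.Lock()
        self._active = 0
        self.peak_active = 0  # highest concurrency actually observed

    def flush(self, work):
        # Block here if `limit` threads are already inside the flush section.
        with self._slots:
            with self._counter_lock:
                self._active += 1
                self.peak_active = max(self.peak_active, self._active)
            try:
                work()
            finally:
                with self._counter_lock:
                    self._active -= 1


def simulated_drain():
    # Stand-in for an expensive persistence call (assumption, not real PMEM I/O).
    time.sleep(0.01)


def run_demo(num_threads=16, limit=MAX_CONCURRENT_FLUSHERS):
    """Start `num_threads` flushers and return the peak concurrency seen."""
    flusher = LimitedFlusher(limit)
    threads = [
        threading.Thread(target=flusher.flush, args=(simulated_drain,))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return flusher.peak_active


if __name__ == "__main__":
    print("peak concurrent flushers:", run_demo())
```

Whether such a cap would help at all depends on whether pmem_drain() shows the same throughput drop under concurrency as PMEM writes do, which is exactly the open question above.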

> I think WALWriteLock itself (i.e. acquiring/releasing it) is not an
> issue - the problem is that writing the WAL to persistent storage itself
> is expensive, and we're waiting for that.
>
> So it's not clear to me if removing the lock (and allowing multiple
> processes to do pmem_drain concurrently) can actually help, considering
> pmem_drain() should flush writes from other processes anyway.

I may be off track, but HPE's benchmark using Oracle 18c, placing the REDO log file on Intel PMEM in App Direct mode, showed only a 27% performance increase compared to even a "SAS" SSD.

https://h20195.www2.hpe.com/v2/getdocument.aspx?docname=a00074230enw

The just-released Oracle 21c has started to support placing data files on PMEM, eliminating the overhead of the buffer cache. It's interesting that this new feature is categorized under "Manageability", not "Performance and scalability."

https://docs.oracle.com/en/database/oracle/oracle-database/21/nfcon/persistent-memory-database-258797846.html

They recommend placing REDO logs on DAX-aware file systems. I wonder what's behind this.

https://docs.oracle.com/en/database/oracle/oracle-database/21/admin/using-PMEM-db-support.html#GUID-D230B9CF-1845-4833-9BF7-43E9F15B7113

"You can use PMEM Filestore for database datafiles and control files. For performance reasons, Oracle recommends that you store redo log files as independent files in a DAX-aware filesystem such as EXT4/XFS."

Regards
Takayuki Tsunakawa
