Re: [HACKERS] WAL logging problem in 9.4.3?

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, 9erthalion6(at)gmail(dot)com, andrew(dot)dunstan(at)2ndquadrant(dot)com, hlinnaka(at)iki(dot)fi, robertmhaas(at)gmail(dot)com, michael(at)paquier(dot)xyz
Subject: Re: [HACKERS] WAL logging problem in 9.4.3?
Date: 2019-10-25 04:12:51
Message-ID: 20191025.131251.322449872063947371.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello. Thanks for the comment.

# Sorry in advance for possilbe breaking the thread.

> MarkBufferDirtyHint() writes WAL even when rd_firstRelfilenodeSubid or
> rd_createSubid is set; see attached test case. It needs to skip WAL whenever
> RelationNeedsWAL() returns false.

Thanks for pointing out that. And the test patch helped me very much.

Most of callers can tell that to the function, but SetHintBits()
cannot easily. Rather I think we shouldn't even try to do
that. Instead, In the attached, MarkBufferDirtyHint() asks storage.c
for sync-pending state of the relfilenode for the buffer. In the
attached patch (0003) RelFileNodeSkippingWAL loops over pendingSyncs
but it is called only at the time FPW is added so I believe it doesn't
affect performance so much. However, we can use hash for pendingSyncs
instead of liked list. Anyway the change is in its own file
v21-0003-Fix-MarkBufferDirtyHint.patch, which will be merged into
0002.

AFAICS all XLogInsert is guarded by RelationNeedsWAL() or in the
non-wal_minimal code paths.

> Cylinder and track sizes are obsolete as user-visible concepts. (They're not
> onstant for a given drive, and I think modern disks provide no way to read
> the relevant parameters.) I like the name "wal_skip_threshold", and my second

I strongly agree. Thanks for the draft. I used it as-is. I don't come
up with an appropriate second description of the GUC so I just removed
it.

# it was "For rotating magnetic disks, it is around the size of a
# track or sylinder."

> the relevant parameters.) I like the name "wal_skip_threshold", and
> my second choice would be "wal_skip_min_size". Possibly documented
> as follows:
..
> Any other opinions on the GUC name?

I prefer the first candidate. I already used the terminology in
storage.c and the name fits more to the context.

> * We emit newpage WAL records for smaller size of relations.
> *
> * Small WAL records have a chance to be emitted at once along with
> * other backends' WAL records. We emit WAL records instead of syncing
> * for files that are smaller than a certain threshold expecting faster
- * commit. The threshold is defined by the GUC effective_io_block_size.
+ * commit. The threshold is defined by the GUC wal_skip_threshold.

The attached are:

- v21-0001-TAP-test-for-copy-truncation-optimization.patch
same as v20

- v21-0002-Fix-WAL-skipping-feature.patch
GUC name changed.

- v21-0003-Fix-MarkBufferDirtyHint.patch
PoC of fixing the function. will be merged into 0002. (New)

- v21-0004-Documentation-for-wal_skip_threshold.patch
GUC name and description changed. (Previous 0003)

- v21-0005-Additional-test-for-new-GUC-setting.patch
including adjusted version of wal-optimize-noah-tests-v3.patch
Maybe test names need further adjustment. (Previous 0004)

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
v21-0001-TAP-test-for-copy-truncation-optimization.patch text/x-patch 11.3 KB
v21-0002-Fix-WAL-skipping-feature.patch text/x-patch 49.9 KB
v21-0003-Fix-MarkBufferDirtyHint.patch text/x-patch 2.3 KB
v21-0004-Documentation-for-wal_skip_threshold.patch text/x-patch 1.8 KB
v21-0005-Additional-test-for-new-GUC-setting.patch text/x-patch 2.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2019-10-25 04:46:14 Re: [HACKERS] WAL logging problem in 9.4.3?
Previous Message Dilip Kumar 2019-10-25 03:44:16 Re: [HACKERS] Block level parallel vacuum