From: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: AdvanceXLInsertBuffers() vs wal_sync_method=open_datasync |
Date: | 2023-11-10 15:16:35 |
Message-ID: | 0005ca83-c5e5-4ea7-94a6-17e973fa47d8@iki.fi |
Lists: | pgsql-hackers |

On 10/11/2023 05:54, Andres Freund wrote:
> In this case I had used wal_sync_method=open_datasync - it's often faster and
> if we want to scale WAL writes more we'll have to use it more widely (you
> can't have multiple fdatasyncs in progress and reason about which one affects
> what, but you can have multiple DSYNC writes in progress at the same time).
Not sure I understand that. If you issue an fdatasync, it will sync all
writes that were complete before the fdatasync started. Right? If you
have multiple fdatasyncs in progress, that's true for each fdatasync. Or
is there a bottleneck in the kernel with multiple in-progress fdatasyncs
or something?
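
To make the two modes under discussion concrete, here is a minimal sketch
(not PostgreSQL code) contrasting buffered writes followed by a single
fdatasync() with writes on a descriptor opened with O_DSYNC. The function
names, the PAGE_SZ constant and the "number of durability waits" return
value are illustrative assumptions; the sketch assumes a POSIX system such
as Linux where both fdatasync() and O_DSYNC are available.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define PAGE_SZ 8192			/* 8kB page size mentioned in the thread */

/*
 * Buffered writes followed by one fdatasync(): a single durability wait
 * covers every write that completed before the call.
 * Returns the number of durability waits (1), or -1 on error.
 */
static int
write_then_fdatasync(const char *path, int npages)
{
	char		page[PAGE_SZ];
	int			fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);

	if (fd < 0)
		return -1;
	memset(page, 0, sizeof(page));
	for (int i = 0; i < npages; i++)
	{
		if (write(fd, page, sizeof(page)) != (ssize_t) sizeof(page))
		{
			close(fd);
			return -1;
		}
	}
	if (fdatasync(fd) != 0)
	{
		close(fd);
		return -1;
	}
	close(fd);
	return 1;
}

/*
 * O_DSYNC: each write() returns only once its data is durable, so
 * npages page-sized writes mean npages durability waits.
 * Returns the number of durability waits (npages), or -1 on error.
 */
static int
write_with_dsync(const char *path, int npages)
{
	char		page[PAGE_SZ];
	int			fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_DSYNC, 0600);

	if (fd < 0)
		return -1;
	memset(page, 0, sizeof(page));
	for (int i = 0; i < npages; i++)
	{
		if (write(fd, page, sizeof(page)) != (ssize_t) sizeof(page))
		{
			close(fd);
			return -1;
		}
	}
	close(fd);
	return npages;
}
```

In the first function the cost of durability is paid once for the whole
batch; in the second it is paid on every write() call, which is why the
size of each write matters so much with open_datasync.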
> After a bit of confused staring and debugging I figured out that the problem
> is that the RequestXLogSwitch() within the code for starting a basebackup was
> triggering writing back the WAL in individual 8kB writes via
> GetXLogBuffer()->AdvanceXLInsertBuffer(). With open_datasync each of these
> writes is durable - on this drive each takes about 1ms.
I see. So the assumption in AdvanceXLInsertBuffer() is that XLogWrite()
is relatively fast. But with open_datasync, it's not.
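
A rough back-of-the-envelope sketch of why that matters. The 8kB write
size and ~1ms latency come from the quoted message; the 16MB segment size
(the PostgreSQL default) and the worst case of a switch near the start of
a segment are assumptions made here for illustration, as are the helper
names.

```c
#include <assert.h>
#include <stdint.h>

/* Pages left to fill in a segment after a switch at offset used_bytes. */
static uint64_t
pages_to_fill(uint64_t segment_bytes, uint64_t used_bytes, uint64_t page_bytes)
{
	return (segment_bytes - used_bytes) / page_bytes;
}

/* Number of write calls needed when each call covers pages_per_write pages. */
static uint64_t
writes_needed(uint64_t npages, uint64_t pages_per_write)
{
	return (npages + pages_per_write - 1) / pages_per_write;
}

/* Total time if the writes are issued one after another. */
static uint64_t
serial_write_ms(uint64_t nwrites, uint64_t ms_per_write)
{
	return nwrites * ms_per_write;
}
```

Under those assumptions a full segment is 2048 pages, so one durable
write per page adds up to roughly two seconds, whereas writing, say, 128
pages per call would need only 16 writes. How long each larger write
takes depends on the drive and is not modelled here.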
> To fix this, I suspect we need to make
> GetXLogBuffer()->AdvanceXLInsertBuffer() flush more aggressively. In this
> specific case, we even know for sure that we are going to fill a lot more
> buffers, so no heuristic would be needed. In other cases however we need some
> heuristic to know how much to write out.
+1. Maybe use the same logic as in XLogFlush().
I wonder if the 'flexible' argument to XLogWrite() is too inflexible. It
would be nice to pass a hard minimum XLogRecPtr that it must write up
to, but still allow it to write more than that if it's convenient.
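
As a sketch of what such an interface could decide (purely hypothetical,
not the existing XLogWrite() API): the caller passes a hard minimum, and
the function may extend the target to the last complete page that has
already been inserted, subject to a batch cap. The function name, the
parameters and the max_batch cap are all assumptions for illustration;
XLogRecPtr is redefined locally as a stand-in for PostgreSQL's type.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;	/* local stand-in for PostgreSQL's type */

#define XLOG_BLCKSZ 8192

/*
 * Hypothetical helper: choose how far to write.
 *
 * written_upto    - WAL already written out
 * must_write_upto - hard minimum the caller needs (<= inserted_upto)
 * inserted_upto   - WAL fully inserted so far; nothing beyond is writable
 * max_batch       - cap, in bytes, on the opportunistic part of the write
 *
 * The result is never below must_write_upto; it is extended to the last
 * complete inserted page when that is within the cap.
 */
static XLogRecPtr
choose_write_target(XLogRecPtr written_upto, XLogRecPtr must_write_upto,
					XLogRecPtr inserted_upto, uint64_t max_batch)
{
	XLogRecPtr	target = must_write_upto;
	XLogRecPtr	convenient = inserted_upto - (inserted_upto % XLOG_BLCKSZ);

	/* Limit the opportunistic extension to max_batch bytes. */
	if (convenient > written_upto + max_batch)
		convenient = written_upto + max_batch;

	/* Write more than required only if it is convenient to do so. */
	if (convenient > target)
		target = convenient;

	return target;
}
```

The hard minimum always wins: if the cap or the page boundary falls short
of must_write_upto, the function still returns must_write_upto.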
--
Heikki Linnakangas
Neon (https://neon.tech)