Re: AdvanceXLInsertBuffers() vs wal_sync_method=open_datasync

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: AdvanceXLInsertBuffers() vs wal_sync_method=open_datasync
Date: 2023-11-10 15:16:35
Message-ID: 0005ca83-c5e5-4ea7-94a6-17e973fa47d8@iki.fi
Lists: pgsql-hackers

On 10/11/2023 05:54, Andres Freund wrote:
> In this case I had used wal_sync_method=open_datasync - it's often faster and
> if we want to scale WAL writes more we'll have to use it more widely (you
> can't have multiple fdatasyncs in progress and reason about which one affects
> what, but you can have multiple DSYNC writes in progress at the same time).

Not sure I understand that. If you issue an fdatasync, it will sync all
writes that were complete before the fdatasync started. Right? If you
have multiple fdatasyncs in progress, that's true for each fdatasync. Or
is there a bottleneck in the kernel with multiple in-progress fdatasyncs
or something?

> After a bit of confused staring and debugging I figured out that the problem
> is that the RequestXLogSwitch() within the code for starting a basebackup was
> triggering writing back the WAL in individual 8kB writes via
> GetXLogBuffer()->AdvanceXLInsertBuffer(). With open_datasync each of these
> writes is durable - on this drive each takes about 1ms.

I see. So the assumption in AdvanceXLInsertBuffer() is that XLogWrite()
is relatively fast. But with open_datasync, it's not.
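
A rough back-of-the-envelope sketch of why this hurts, assuming the default 8kB WAL block size and treating the cost of a durable write as a fixed per-call latency regardless of size (a simplification). The function names are made up for illustration and are not PostgreSQL APIs.

```c
#include <stdint.h>

#define XLOG_BLCKSZ 8192        /* default WAL block size, assumed here */

/*
 * Number of durable write calls needed to write out nbytes of WAL when
 * each call covers at most pages_per_write pages.
 */
static uint64_t
durable_write_calls(uint64_t nbytes, uint64_t pages_per_write)
{
    uint64_t npages = (nbytes + XLOG_BLCKSZ - 1) / XLOG_BLCKSZ;

    return (npages + pages_per_write - 1) / pages_per_write;
}

/*
 * Estimated total time in milliseconds, assuming every durable write
 * call costs ms_per_write no matter how many pages it covers.
 */
static double
estimated_write_ms(uint64_t nbytes, uint64_t pages_per_write,
                   double ms_per_write)
{
    return (double) durable_write_calls(nbytes, pages_per_write) * ms_per_write;
}
```

Under these assumptions, filling out a 16MB segment one page at a time takes 2048 durable writes, so about 2 seconds at 1ms each, while writing 128 pages per call would need only 16 calls.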

> To fix this, I suspect we need to make
> GetXLogBuffer()->AdvanceXLInsertBuffer() flush more aggressively. In this
> specific case, we even know for sure that we are going to fill a lot more
> buffers, so no heuristic would be needed. In other cases however we need some
> heuristic to know how much to write out.

+1. Maybe use the same logic as in XLogFlush().

I wonder if the 'flexible' argument to XLogWrite() is too inflexible. It
would be nice to pass a hard minimum XLogRecPtr that it must write up
to, but still allow it to write more than that if it's convenient.
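
To make the idea concrete, here is a standalone sketch of the target selection only. choose_write_target() and its parameters are hypothetical, not an existing or proposed PostgreSQL function. It takes a hard minimum that must be written, plus how far WAL is ready in the buffers, and extends the write to the last full page boundary when that lies beyond the minimum.

```c
#include <stdint.h>

typedef uint64_t XLogRecPtr;    /* WAL position as a 64-bit byte offset */

/*
 * Hypothetical helper: decide how far a write should go.
 *
 *   hard_min   - the caller requires WAL to be written at least up to here
 *   ready_upto - how far WAL is already available in the buffers
 *   page_size  - WAL page size in bytes
 *
 * The result is never below hard_min.  If whole pages are ready beyond
 * hard_min, the target is extended to the last full page boundary at or
 * below ready_upto, since writing complete pages is the convenient case.
 */
static XLogRecPtr
choose_write_target(XLogRecPtr hard_min, XLogRecPtr ready_upto,
                    uint64_t page_size)
{
    XLogRecPtr convenient = ready_upto - (ready_upto % page_size);

    return convenient > hard_min ? convenient : hard_min;
}
```

For example, with 8kB pages, a hard minimum of 10000 and WAL ready up to 40000, the target becomes 32768 (four full pages); if WAL is only ready up to 12000, the target stays at the hard minimum of 10000.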

--
Heikki Linnakangas
Neon (https://neon.tech)
