Re: wal_sync_method=fsync_writethrough

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: wal_sync_method=fsync_writethrough
Date: 2022-08-29 15:44:25
Message-ID: CABUevEz2_HuNJP+gTjaJ9uDvAaC4RUr35Q1+-6L2VX6RDB_AGw@mail.gmail.com
Lists: pgsql-hackers

On Fri, Aug 26, 2022 at 11:29 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
>
> On Sat, Aug 27, 2022 at 12:17 AM Magnus Hagander <magnus(at)hagander(dot)net> wrote:
> > So, I don't know how it works now, but the history at least was this:
> > it was not about the disk caches, it was about raid controller caches.
> > Basically, we determined that windows didn't fsync it all the way. But
> > it would with But if we changed wal_sync_method=fsync to actually
> > *do* that, then people who had paid big money for raid controllers
> > with flash or battery backed cache would lose a ton of performance. So
> > we needed one level that would sync out of the OS but not through the
> > RAID cache, and another one that would sync it out of the RAID cache
> > as well. Which would/could be different from the drive caches
> > themselves, and they often behaved differently. And I think it may
> > have even been dependent on the individual RAID drivers what the
> > default would be.
>
> Thanks for the background. Yeah, that makes sense to motivate
> open_datasync for Windows. Not sure what you meant about fsync or
> meant to write after "would with".

That's a good question indeed :) I think I meant it would with
FILE_FLAG_WRITE_THROUGH.

> It seems like the 2005 discussions were primarily about open_datasync
> but also had the by-product of introducing the name
> fsync_writethrough. If I'm reading between the lines[1] correctly,
> perhaps the logic went like this:
>
> 1. We noticed that _commit() AKA FlushFileBuffers() issued
> SYNCHRONIZE CACHE (or equivalent) on Windows.
>
> 2. At that time in history, Linux (and other Unixes) probably did not
> issue SYNCHRONIZE CACHE when you called fsync()/fdatasync().

I think it may have been driver dependent there (as well), at the time.

> 3. We concluded therefore that Windows was strange and we needed to
> use a different level name for the setting to reflect this extra
> effect.

It was certainly strange to us :)

> Now it looks strange: we have both "fsync" and "fsync_writethrough"
> doing exactly the same thing while vaguely implying otherwise, and the
> contrast with other operating systems (if I divined that aspect
> correctly) mostly doesn't apply. How flush commands affect various
> caches in modern storage stacks is also not really OS-specific AFAIK.
>
> (Obviously macOS is a different story...)

Given that it does vary (because macOS is actually an OS :D), we might
need to start from a matrix of exactly what happens in different
states, and then try to map that to a set? I fully agree that if
things actually behave the same, they should be called the same.
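
A rough sketch of what such a matrix could look like, and how to read off
the names that ought to be merged. The rows are illustrative assumptions
based on this thread (the Windows entries) and on my understanding of the
macOS behaviour, not a verified survey; MATRIX and same_behaviour are
made-up names for this example only.

```python
from collections import defaultdict

# (OS, wal_sync_method) -> primitive believed to be used.
# Illustrative assumptions only; a real matrix would need checking per
# OS version and per driver.
MATRIX = {
    ("windows", "fsync"): "_commit/FlushFileBuffers",
    ("windows", "fsync_writethrough"): "_commit/FlushFileBuffers",
    ("windows", "open_datasync"): "FILE_FLAG_WRITE_THROUGH",
    ("macos", "fsync"): "fsync",
    ("macos", "fsync_writethrough"): "fcntl(F_FULLFSYNC)",
}


def same_behaviour(matrix):
    """Return, per OS, the setting names that map to the same primitive."""
    groups = defaultdict(list)
    for (os_name, method), primitive in matrix.items():
        groups[(os_name, primitive)].append(method)
    # Keep only groups where two or more names do the same thing.
    return {key: sorted(names) for key, names in groups.items() if len(names) > 1}


if __name__ == "__main__":
    for (os_name, primitive), names in same_behaviour(MATRIX).items():
        print(os_name, primitive, names)
```

With these rows, the only group reported is the Windows pair, which is the
case Thomas pointed out. The same shape could be extended with a column for
which cache level each primitive is expected to reach.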

And it may also be that there is no longer a difference between
direct-attached drives and RAID with battery- or flash-backed cache,
which used to be the huge difference back then, where you had to tune
for it. In many cases that has been negated by simply not using such
controllers (and using NVMe and possibly software RAID instead), but
there are certainly still people using such systems...

//Magnus
