Re: Atomic rename feature for Windows.

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Victor Spirin <v(dot)spirin(at)postgrespro(dot)ru>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Michael Paquier <michael(at)paquier(dot)xyz>
Subject: Re: Atomic rename feature for Windows.
Date: 2021-09-07 01:40:51
Message-ID: CA+hUKGKXQPyb8CrBNp5uu5e2ZtHLiqPWR_DoZfS7NO8GeyiSxw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Sep 7, 2021 at 5:44 AM Victor Spirin <v(dot)spirin(at)postgrespro(dot)ru> wrote:
> I have changed the way I add the manifest to projects. I used the
> AdditionalManifestFiles option for a VS project.

Hi Victor,

Thanks for working on this!

I wonder if POSIX-style rename is used automatically on recent
Windows, based on the new clue that DeleteFile() has started
defaulting to POSIX semantics[1] (maybe it would require ReplaceFile()
instead of MoveFileEx(), but I have no idea.) If so, one question is
whether we'd still want to do this explicit POSIX rename dance, or
whether we should just wait a bit longer for it to happen
automatically on all relevant systems (plus tweak to use ReplaceFile()
if necessary). If not, we might want to do what you're proposing
anyway, especially if ReplaceFile() is required, because its interface
is weird (it only works if the other file exists?). Hmm, that alone
would be a good reason to go with your plan regardless, and of course
it would be good to see this fixed everywhere ASAP.

We still have to answer that question for pgunlink(). I was
contemplating that over in that other thread, because unlink() ->
EACCES is blocking something I'm working on. I found a partial
solution to that that works even on old and non-NTFS systems, and I
was thinking that would be enough for now and we could just patiently
wait until automatic POSIX semantics to arrives on all relevant
systems as the real long term solution, so I didn't need to expend
energy doing an intermediate explicit POSIX-mode wrapper like what
you're proposing. But then it seems strange to make a different
choice about that for rename() and unlink(). So... do you think it
would make sense to extend your patch to cover unlink() too?

It would be great to have a tool in the tree that tests directory
entry semantics, called something like src/bin/pg_test_dirmod, so that
it becomes very clear when POSIX semantics are being used. It could
test various interesting unlink and rename scenarios through our
wrappers (concurrent file handles, creating a new file with the name
of the old one, unlinking the containing directory, ...). It could
run on the build farm animals, and we could even ask people to run it
when they report problems, to try to lift the fog of bizarro Windows
file system semantics.

How exactly does the function fail on a file system that doesn't
support the new POSIX semantics? Assuming there is something like
ENOSUPP to report "file system says no", do we want to keep trying
every time, or remember that it doesn't work? I guess the answer may
vary per mount point, which makes it hard to track when you only have
an fd...

If it fails because of a sharing violation, it seems strange that we
immediately fall back to the old code to do the traditional (and
horrible) sleep/retry loop. That means that in rare conditions we can
still get the old behaviour that leads to problems, just because of a
transient condition. Hmm. Would it make more sense to say: fall back
to the traditional behaviour only for ENOSUPP (if there is such a
thing), cope with transient sharing violations without giving up on
POSIX semantics, and report all other failures immediately?

I agree that the existing enum CHECKSUM_TYPE_NONE + friends should be
renamed to something less collision-prone and more consistent with the
name of the enum ("pg_checksum_type"), so I'd vote for adding a PG_
prefix, in a separate patch.

+ <Manifest>
+ <AdditionalManifestFiles>src/port/win10.manifest</AdditionalManifestFiles>
+ </Manifest>

I have no opinion on how you're supposed to test for OS versions, but
one trivial observation is that that file declares support for many
Windows releases, and I guess pretty soon you'll need to add 11, and
then we'll wonder why it says 10 in the file name. Would it be better
as "windows.manifest" or something?

+pgrename_win10(const char *from, const char *to)

Same thought on the name: this'll age badly. What about something
like pgrename_windows_posix_semantics?

+typedef struct _FILE_RENAME_INFO_VVS {
+ union {
+ BOOLEAN ReplaceIfExists; // FileRenameInfo
+ DWORD Flags; // FileRenameInfoEx
+ } DUMMYUNIONNAME;
+ HANDLE RootDirectory;
+ DWORD FileNameLength;
+ WCHAR FileName[MAX_PATH];
+} FILE_RENAME_INFO_VVS;

Why can't we use a system header[2] for this?

+ if (MultiByteToWideChar(CP_ACP, 0, (LPCCH)from, -1,
(LPWSTR)from_w, MAX_PATH) == 0) return -1;
+ if (MultiByteToWideChar(CP_ACP, 0, (LPCCH)to, -1,
(LPWSTR)rename_info.FileName, MAX_PATH) == 0) return -1;

Don't these need _dosmaperr(GetLastError())?

[1] https://www.postgresql.org/message-id/20210905214437.y25j42yigwnbdvtg%40alap3.anarazel.de
[2] https://docs.microsoft.com/en-us/windows/win32/api/winbase/ns-winbase-file_rename_info

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2021-09-07 02:08:24 Re: The Free Space Map: Problems and Opportunities
Previous Message kuroda.hayato@fujitsu.com 2021-09-07 01:32:43 RE: Allow escape in application_name (was: [postgres_fdw] add local pid to fallback_application_name)