Re: [PATCH] Atomic pgrename on Windows

From: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Atomic pgrename on Windows
Date: 2017-11-28 16:16:46
Message-ID: CAPpHfds9trA6ipezK3BsuuOSQwEmESiqj8pkOxACFJpoLpcoNw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Nov 28, 2017 at 5:02 AM, Craig Ringer <craig(at)2ndquadrant(dot)com> wrote:

> On 27 November 2017 at 14:28, Alexander Korotkov <
> a(dot)korotkov(at)postgrespro(dot)ru> wrote:
>
>> It's assumed in PostgreSQL codebase that pgrename atomically replaces
>> target file with source file even if target file is open and being read by
>> another process. And this assumption is true on Linux, but it's false on
>> Windows. MoveFileEx() triggers an error when target file is open (and
>> accordingly locked). Some our customers has been faced such errors while
>> operating heavily loaded PostgreSQL instance on Windows.
>>
>> LOG could not rename temporary statistics file "pg_stat_tmp/global.tmp"
>> to "pg_stat_tmp/global.stat": Permission denied
>>
>
> That would explain a number of intermittent reports Iv'e seen floating
> around.
>

Yeah.

Attached patch atomic-pgrename-windows-1.patch fixes this problem. It
>> appears to be possible to atomically replace file on Windows –
>> ReplaceFile() does that. ReplaceFiles() requires target file to exist,
>> this is why we still need to call MoveFileEx() when it doesn't exist.
>>
>
> Look at the error codes for ReplaceFile:
>
> https://msdn.microsoft.com/en-us/library/aa365512(VS.85).aspx
>
> The docs don't say it's atomic and the error codes suggest it may not be.
> But there's a Microsoft Research paper claiming it's atomic -
> http://research.microsoft.com/pubs/64525/tr-2006-45.pdf .
>
> It appears that MoveFileTransacted would be what we really want, when
> we're on NTFS (it won't work on FAT32 or network shares). See
> https://msdn.microsoft.com/en-us/library/windows/
> desktop/aa365241(v=vs.85).aspx . But the docs have a preface warning that
> it's not recommended and may not be available in future Windows versions,
> so that's not an option.
>
> This Go language bug (https://github.com/golang/go/issues/8914) and this
> cPython discussion (http://bugs.python.org/issue8828)) have discussion.
>
> I found this comment particularly illuminating https://bugs.python.org/
> msg146307 as it quotes what Java does. It uses MoveFileEx.
>

That's look bad. MoveFileTransacted() looks like best option, but it's not
an option since it's deprecated.
For now, ReplaceFile() function looks like best choice for renaming
statfiles. But it doesn't look good for critical datafiles whose are also
renamed using pgrename, because system can crash between rename of
destination file and rename of source file. Thus, MoveFileEx() seems still
best solution for critical datafiles where safety is more important than
concurrency. After reading provided links, I'm very suspicious about its
safety too. But it's seems like best available solution and it have
already passed some test of time...

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2017-11-28 16:21:16 Re: [HACKERS] postgres_fdw bug in 9.6
Previous Message Robert Haas 2017-11-28 16:12:47 Re: explain analyze output with parallel workers - question about meaning of information for explain.depesz.com