Re: stat() vs ERROR_DELETE_PENDING, round N + 1

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Alexander Lakhin <exclusion(at)gmail(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Juan José Santamaría Flecha <juanjo(dot)santamaria(at)gmail(dot)com>
Subject: Re: stat() vs ERROR_DELETE_PENDING, round N + 1
Date: 2021-09-10 05:04:09
Message-ID: CA+hUKGKt_Q8cgptOQZPOYChJzx6u5_1jw=eRyV_hfaTLjRoAYQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Sep 7, 2021 at 7:00 PM Alexander Lakhin <exclusion(at)gmail(dot)com> wrote:
> 07.09.2021 09:05, Michael Paquier wrote:
> > On Tue, Sep 07, 2021 at 09:00:01AM +0300, Alexander Lakhin wrote:
> >> The new approach looks very promising. Knowing that the file is really
> >> in the DELETE_PENDING state simplifies a lot.
> >> I've tested the patch v2_0001_Check... with my demo tests [1] and [2],
> >> and it definitely works.

> > Oho, nice. Just to be sure. You are referring to
> > v2-0001-Check*.patch posted here, right?
> > https://www.postgresql.org/message-id/CA+hUKGKj3p+2AciBGacCf_cXE0JLCYevWHexvOpK6uL1+V-zag@mail.gmail.com

> Yes, i've tested that one, on the master branch (my tests needed a minor
> modification due to PostgresNode changes).

Thanks very much!

Time to tidy up some loose ends. There are a couple of judgement
calls involved. Here's what Andres and I came up with in an off-list
chat. Any different suggestions?

1. I abandoned the "direct NtCreateFile()" version for now. I guess
using more and wider unstable interfaces might expose us to greater
risk of silent API/behavior changes or have subtle bugs. If we ever
have a concrete reason to believe that RtlGetLastNtStatus() is not
reliable here, we could reconsider.

2. I dropped the assertion that the signal event has been created
before the first call to the open() wrapper. Instead, I taught
pg_usleep() to fall back to plain old SleepEx() if the signal stuff
isn't up yet. Other solutions are possible of course, but it struck
me as a bad idea to place initialisation ordering constraints on very
basic facilities like open() and stat().

I should point out explicitly that with this patch, stat() benefits
from open()'s tolerance for sharing violations, as a side effect.
That is, it'll retry for a short time in the hope that whoever opened
our file without allowing sharing will soon go away. I don't know how
useful that bandaid loop really is in practice, but I don't see why
we'd want that for open() and not stat(), so this change seems good to
me on consistency grounds at the very least.

3. We fixed the warnings about macro redefinition with #define
UMDF_USING_NTSTATUS and #include <ntstatus.h> in win32_port.h. (Other
ideas considered: (1) Andres reported that it also works to move the
#include to ~12 files that need things from it, ie things that were
suppressed from windows.h by that macro and must now be had from
ntstatus.h, but the files you have to change are probably different in
back branches if we decide to do that, (2) I tried defining that macro
locally in files that need it, *before* including c.h/postgres.h, and
then locally include ntstatus.h afterwards, but that seems to violate
project style and generally seems weird.)

Another thing to point out explicitly is that I added a new file
src/port/win32ntdll.c, which is responsible for fishing out the NT
function pointers. It was useful to be able to do that in the
abandoned NtCreateFile() variant because it needed three of them and I
could reduce boiler-plate noise with a static array of function names
to loop over. In this version the array has just one element, but I'd
still rather centralise this stuff in one place and make it easy to
add any more of these that we eventually find a need for.

BTW, I also plan to help Victor get his "POSIX semantics" patch[1]
into the tree (and extend it to cover more ops). That should make
these problems go away in a more complete way IIUC, but doesn't work
everywhere (not sure if we have any build farm animals where it
doesn't work, if so it might be nice to change that), so it's
complementary to this patch. (My earlier idea that that stuff would
magically start happening for free on all relevant systems some time
soon has faded.)

[1] https://www.postgresql.org/message-id/flat/a529b660-da15-5b62-21a0-9936768210fd%40postgrespro.ru

Attachment Content-Type Size
v3-0001-Check-for-STATUS_DELETE_PENDING-on-Windows.patch text/x-patch 17.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Julien Rouhaud 2021-09-10 05:08:21 Re: parallelizing the archiver
Previous Message kuroda.hayato@fujitsu.com 2021-09-10 04:33:53 RE: Allow escape in application_name