Re: BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDING file

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, TAKATSUKA Haruka <harukat(at)sraoss(dot)co(dot)jp>, PostgreSQL Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDING file
Date: 2016-08-22 13:17:21
Message-ID: CABUevEzwkhp0Ev-xxuSDQueFKBYs8=FEthxrys6MOvReZNXK8A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Mon, Aug 22, 2016 at 3:13 PM, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
wrote:

> On Fri, Aug 19, 2016 at 4:15 PM, Michael Paquier
> <michael(dot)paquier(at)gmail(dot)com> wrote:
> > On Thu, Aug 18, 2016 at 7:07 PM, Magnus Hagander <magnus(at)hagander(dot)net>
> wrote:
> >> On Thu, Aug 18, 2016 at 2:25 AM, Michael Paquier <
> michael(dot)paquier(at)gmail(dot)com>
> >> wrote:
> >>> On Tue, Aug 16, 2016 at 4:56 PM, Magnus Hagander <magnus(at)hagander(dot)net>
> >>> wrote:
> >>> > On Wed, Jul 13, 2016 at 3:10 AM, Michael Paquier
> >>> > <michael(dot)paquier(at)gmail(dot)com>
> >>> > wrote:
> >> That's pretty much exactly what I said, isn't it? :)
> >
> > :p
> >
> >>> lstat() needs a similar treatment, see sendTablespace that calls it.
> >>> port.h needs also to have its comments updated.
> >>
> >> port/win32.h has:
> >> /*
> >> * Supplement to <sys/stat.h>.
> >> */
> >> #define lstat(path, sb) stat((path), (sb))
> >>
> >> So I don't think we should need that, no?
> >
> > Right, this mapping may be caused by the fact that WIN32 uses junction
> > points. Is that right?
> >
> >>> I think that your patch is definitely an improvement, and that we may
> >>> have better backpatch it: any extension relying on those calls is
> >>> going to fail similarly, so this gives an escape path. To be honest, I
> >>> have not thought of GetLastError() and check that with
> >>> STATUS_DELETE_PENDING, but that's definitely the right call to ignore
> >>> a file that has this status instead of failing with EACCESS.
> >>>
> >>
> >> So, thinking more about that.
> >>
> >> AllocateFile() calls fopen(). That one is only documented to set errno,
> and
> >> might not always set the value used by GetLastError().
> >>
> >> Perhaps we should make AllocateFile() on win32 specifically check for
> >> EACCESS, and in case we get that error then we do a
> GetFileAttributesEx() on
> >> the same file and see if we get STATUS_DELETE_PENDING or not. If we get
> >> STATUS_DELETE_PENDING we rewrite EACCESS to ENOENT and return, if we get
> >> anything else we just let it through.
> >
> > That's a good idea.
>
> So, working more on it I have not been able to reproduce the failure
> reported even after running pg_basebackup in loop with a pgbench
> running on a single client with this script to create a bunch of
> relfilenodes:
> insert into hoge values (1);
> truncate hoge;
> max_file_size and checkpoint_timeout were also set to minimum to
> increase checkpoint frequency.
>
> Anyway, I have spent some time eye-balling the patch proposed by
> Magnus and I think that it is over-complicated. First I have noticed
> here something:
> https://msdn.microsoft.com/en-us/library/windows/desktop/
> ms681382(v=vs.85).aspx
> There is an error code ERROR_DELETE_PENDING that we can use to perform
> the checks we want, and I am noticing that this is not getting listed
> in win32error.c, so we could just map that with ENOENT and we are
> basically done: AllocateFile() calls pgwin32_fopen() and stat() calls
> win32_safestats, both of them finishing with _dosmaperr() to be sure
> that errno is correctly set. This results in the simple patch
> attached.
>

Not having looked in detail, but in pgwin32_safestat(), if the stat() call
fails, we return immediately without calling _dosmaperr(), don't we? So
we're still going to error out there with whatever the default mapping is,
and that's access denied.

It's only if the the stat() call succeeds but the getting of extended
attributes fail that we actually call _dosmaperr().

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Michael Paquier 2016-08-22 13:42:04 Re: BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDING file
Previous Message Michael Paquier 2016-08-22 13:13:44 Re: BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDING file