From: | Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> |
---|---|
To: | Magnus Hagander <magnus(at)hagander(dot)net> |
Cc: | Pg Bugs <pgsql-bugs(at)postgresql(dot)org> |
Subject: | Re: pg_basebackup fails if a data file is removed |
Date: | 2012-12-21 13:38:10 |
Message-ID: | 50D46642.6010008@vmware.com |
Lists: | pgsql-bugs |
On 21.12.2012 15:30, Magnus Hagander wrote:
> On Fri, Dec 21, 2012 at 2:28 PM, Heikki Linnakangas
> <hlinnakangas(at)vmware(dot)com> wrote:
>> When pg_basebackup copies data files, it does basically this:
>>
>>> if (lstat(pathbuf, &statbuf) != 0)
>>> {
>>>     if (errno != ENOENT)
>>>         ereport(ERROR,
>>>                 (errcode_for_file_access(),
>>>                  errmsg("could not stat file or directory \"%s\": %m",
>>>                         pathbuf)));
>>>
>>>     /* If the file went away while scanning, it's no error. */
>>>     continue;
>>> }
>>
>>> ...
>>> sendFile(pathbuf, pathbuf + basepathlen + 1, &statbuf);
>>
>> There's a race condition there. If the file is removed after the lstat call,
>> and before sendFile opens the file, the backup fails with an error. It's a
>> fairly tight window, so it's difficult to run into by accident, but by
>> putting a breakpoint with a debugger there it's quite easy to reproduce, by
>> e.g. doing a VACUUM FULL on the table about to be copied.
>>
>> A straightforward fix is to allow sendFile() to ignore ENOENT. Patch
>> attached.
>
> Looks good to me.
Ok, committed.
> Nice spot - don't tell me you actually ran into it
> during testing? :)
Heh, no, eyeballing the code.
- Heikki
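
For readers following the thread, the idea of the fix can be sketched outside the server code. This is only an illustration, not the committed patch: send_file_sketch, SendResult and the missing_ok flag are hypothetical names, and plain fopen() stands in for the backend's file-access routines. It assumes a POSIX system, where fopen() sets errno to ENOENT for a missing file. The point is that the open itself has to tolerate ENOENT, because the earlier lstat() says nothing about whether the file still exists by the time it is opened.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical result codes for this sketch; not part of PostgreSQL. */
typedef enum { SEND_OK, SEND_SKIPPED, SEND_ERROR } SendResult;

/*
 * Hypothetical stand-in for sendFile(). It "sends" a file by reading it
 * and counting the bytes. If the file has vanished between the caller's
 * lstat() and this open, and missing_ok is true, the file is skipped
 * instead of failing the whole backup.
 */
static SendResult
send_file_sketch(const char *path, bool missing_ok, long *bytes_sent)
{
    char    buf[8192];
    size_t  n;
    long    total = 0;
    FILE   *fp = fopen(path, "rb");

    if (fp == NULL)
    {
        /* The file went away after it was stat'ed: not an error. */
        if (errno == ENOENT && missing_ok)
            return SEND_SKIPPED;
        return SEND_ERROR;
    }

    /* Read to end of file, counting what would have been sent. */
    while ((n = fread(buf, 1, sizeof(buf), fp)) > 0)
        total += (long) n;

    if (ferror(fp))
    {
        fclose(fp);
        return SEND_ERROR;
    }

    fclose(fp);
    *bytes_sent = total;
    return SEND_OK;
}
```

A caller looping over a directory would treat SEND_SKIPPED the same way the lstat() branch treats ENOENT, that is, move on to the next entry. Any other open failure is still reported as an error.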