Re: BUG #5038: WAL file is pending deletion in pg_xlog folder, this interferes with WAL archiving.

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To:
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Luke Koops <luke(dot)koops(at)entrust(dot)com>, "pgsql-bugs(at)postgresql(dot)org" <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #5038: WAL file is pending deletion in pg_xlog folder, this interferes with WAL archiving.
Date: 2009-09-10 09:44:27
Message-ID: 4AA8CA7B.4020608@enterprisedb.com
Lists: pgsql-bugs

Heikki Linnakangas wrote:
> Tom Lane wrote:
>> Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
>>> No, it's a backend that's holding the file open, with FILE_SHARE_DELETE.
>> If that's the only case we care about covering, then rename might be
>> enough. I was just wondering what it would take to solve the more
>> general problem of something holding it open with the wrong flags
>> at the time we want to get rid of it.
>
> Yes, that's a separate problem, and I think we should address that too.
> That's what I thought was going on in OP's case at first, the patch I
> posted in my first reply should address that.
>
> I'll try to reproduce that case too, and verify that the patch fixes it.

Ok, I've committed a patch along those lines. The file is now renamed
before unlinking (on Windows), and the return codes of rename() and
unlink() are checked, so that we don't delete the .done file if the WAL
file deletion failed. This fixes both scenarios: the one the OP reported,
with another backend keeping the file open, and the one where a
different process keeps the file open without FILE_SHARE_DELETE.
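
For anyone following along, here is a rough sketch of the
rename-then-unlink idea in portable C. It is not the committed code:
the function name remove_wal_segment(), the ".deleted" suffix, the
fixed buffer size and the plain stderr reporting are all assumptions
made for illustration. The actual patch only does the rename step on
Windows, whereas this sketch does it unconditionally, and it uses the
standard remove() where the backend would call unlink().

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only (not PostgreSQL's code).
 *
 * Rename the WAL segment out of the way, delete the renamed file, and
 * remove the archive-status ".done" file only if both steps succeeded.
 * Returns 0 on success, -1 on failure.
 */
int
remove_wal_segment(const char *wal_path, const char *done_path)
{
    char tmp_path[1024];    /* assumed size, not MAXPGPATH */
    int  n;

    assert(wal_path != NULL && done_path != NULL);

    /* Build the temporary name; the ".deleted" suffix is an assumption. */
    n = snprintf(tmp_path, sizeof(tmp_path), "%s.deleted", wal_path);
    if (n < 0 || (size_t) n >= sizeof(tmp_path))
    {
        errno = ENAMETOOLONG;
        return -1;
    }

    /* Step 1: rename, so the original name is free even if the delete lags. */
    if (rename(wal_path, tmp_path) != 0)
    {
        fprintf(stderr, "could not rename \"%s\" to \"%s\": %s\n",
                wal_path, tmp_path, strerror(errno));
        return -1;          /* .done file is left in place */
    }

    /* Step 2: delete the renamed file and check the result. */
    if (remove(tmp_path) != 0)
    {
        fprintf(stderr, "could not remove \"%s\": %s\n",
                tmp_path, strerror(errno));
        return -1;          /* .done file is left in place */
    }

    /* Step 3: only now is it safe to drop the .done status file. */
    if (remove(done_path) != 0)
    {
        fprintf(stderr, "could not remove \"%s\": %s\n",
                done_path, strerror(errno));
        return -1;
    }

    return 0;
}
```

The point of the ordering is that any early return leaves the .done
file alone, so a failed deletion is not forgotten.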

I considered making failure to rename or delete a WARNING instead of an
ERROR, so that RemoveOldXLogFiles() would still clean up any other old
WAL files. However, when a file is recycled, we throw an error anyway if
the rename fails in InstallXLogFileSegment(), so it doesn't seem like it
would buy us much.

BTW, it seems that errno is not set on Windows when rename fails, but we
still try to print the OS error message in InstallXLogFileSegment().
When I tested the case where another process is keeping the file locked,
for example, I got this:

ERROR: could not rename file "pg_xlog/000000010000000100000073" to
"pg_xlog/000000010000000100000092" (initialization of log file 1,
segment 146): No such file or directory

even though the file clearly exists; it's just locked. I'm not sure
where errno is coming from in that case, or whether we should do
something about that, but that exceeds my appetite for fixing Windows
issues right now.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
