Re: BUG #5038: WAL file is pending deletion in pg_xlog folder, this interferes with WAL archiving.

From: Luke Koops <luke(dot)koops(at)entrust(dot)com>
To: Luke Koops <luke(dot)koops(at)entrust(dot)com>, 'Tom Lane' <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: "pgsql-bugs(at)postgresql(dot)org" <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #5038: WAL file is pending deletion in pg_xlog folder, this interferes with WAL archiving.
Date: 2009-09-09 18:36:23
Message-ID: A3144629B5AC714A8BF27806EBFA7057514623BC@sottexch7.corp.ad.entrust.com
Lists: pgsql-bugs

For those of you who are still looking at this: I tried to reproduce the issue by holding one of the WAL files open with another program (I simply opened it with the Cygwin build of less.exe for Windows). That did not reproduce it; holding the file open that way prevented unlink and rename from working at all. I then wrote a program (open.exe) that opens the file using pgwin32_open(), passing the same parameters that postgres uses when opening a WAL file. With the file held open that way, it could still be renamed, and when it was deleted, the open file went into the pending deletion state.
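For anyone who wants to repeat the experiment, here is a minimal sketch of a holder routine in the spirit of open.exe. It is not the source of open.exe: the names hold_open(), release_held() and held_file are made up for illustration, and it calls CreateFileA() directly with all three sharing flags on the assumption that this matches what pgwin32_open() requests. The non-Windows branch exists only so the sketch builds elsewhere; there, any open descriptor survives an unlink.

```c
/* Hypothetical stand-in for open.exe; illustrative names throughout. */
#include <assert.h>

#ifdef _WIN32
#include <windows.h>
typedef HANDLE held_file;
#define HELD_FILE_INVALID INVALID_HANDLE_VALUE
#else
#include <fcntl.h>
#include <unistd.h>
typedef int held_file;
#define HELD_FILE_INVALID (-1)
#endif

/* Open an existing file while still letting other processes rename or
 * delete it. Returns HELD_FILE_INVALID on failure. */
held_file hold_open(const char *path)
{
    assert(path != NULL);
#ifdef _WIN32
    /* FILE_SHARE_DELETE is the flag that permits a concurrent rename or
     * delete; without it those operations fail outright. */
    return CreateFileA(path, GENERIC_READ | GENERIC_WRITE,
                       FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
                       NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
#else
    /* POSIX has no share modes; the open descriptor simply outlives
     * the directory entry. */
    return open(path, O_RDWR);
#endif
}

/* Release a handle obtained from hold_open(). */
void release_held(held_file f)
{
    if (f == HELD_FILE_INVALID)
        return;
#ifdef _WIN32
    CloseHandle(f);
#else
    close(f);
#endif
}
```

Calling hold_open() on a WAL segment and then waiting (for a key press, say) before release_held() should be enough to keep the file in the held state while the server recycles or deletes it.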

I used open.exe to hold onto a WAL file that was about to be recycled. The recycling worked, but what is going to happen down the road when the handle is released, leaving a gap in the WAL file sequence? Or, if it is not released, when a backend tries to open the WAL file and does not have access to it?

When open.exe was holding onto a WAL file that was going to be deleted, the deletion worked and the file went into the deletion pending state. The archive status for the WAL file went through the sequence .ready ==> .done ==> {no status file} ==> .ready. From that point on, Postgres repeatedly tries to archive the WAL file.

I reported earlier that I believed postgres had leaked the file handle to the WAL file. I no longer believe that is the case. We have a process that only checks data in the database for integrity; it only reads. I think it opened the WAL file initially (perhaps the backend had some maintenance work to do when that session started and had to write something to the WAL) and then never moved on to a new one.

Now that I can reproduce the pending deletion case, I'm working on code to detect it reliably and, hopefully, efficiently.
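To make the idea concrete, here is a rough sketch of one possible check. It is a heuristic, not a finished detector: it relies on the behaviour described in the KB article quoted below, namely that opening a file in the pending deletion state fails with "Access Denied". The names probe_wal_file() and wal_file_state are hypothetical. Note that ERROR_ACCESS_DENIED can also indicate a genuine permissions problem, so the result is only "maybe pending deletion". The non-Windows branch is there so the sketch compiles on other platforms, where the state does not arise.

```c
/* Heuristic probe for a file that is listed but cannot be opened. */
#include <assert.h>
#include <errno.h>

#ifdef _WIN32
#include <windows.h>
#else
#include <fcntl.h>
#include <unistd.h>
#endif

typedef enum {
    WAL_FILE_USABLE,               /* opened successfully */
    WAL_FILE_MISSING,              /* no such file */
    WAL_FILE_MAYBE_PENDING_DELETE, /* access denied: possibly pending deletion */
    WAL_FILE_OTHER_ERROR
} wal_file_state;

wal_file_state probe_wal_file(const char *path)
{
    assert(path != NULL);
#ifdef _WIN32
    /* Share everything so the probe itself never blocks other users. */
    HANDLE h = CreateFileA(path, GENERIC_READ,
                           FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
                           NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (h != INVALID_HANDLE_VALUE) {
        CloseHandle(h);
        return WAL_FILE_USABLE;
    }
    switch (GetLastError()) {
    case ERROR_FILE_NOT_FOUND:
    case ERROR_PATH_NOT_FOUND:
        return WAL_FILE_MISSING;
    case ERROR_ACCESS_DENIED:
        /* Assumption: a delete-pending file surfaces as access denied. */
        return WAL_FILE_MAYBE_PENDING_DELETE;
    default:
        return WAL_FILE_OTHER_ERROR;
    }
#else
    int fd = open(path, O_RDONLY);
    if (fd >= 0) {
        close(fd);
        return WAL_FILE_USABLE;
    }
    return (errno == ENOENT) ? WAL_FILE_MISSING : WAL_FILE_OTHER_ERROR;
#endif
}
```

A scan of pg_xlog could call this for each listed name and skip entries that come back as WAL_FILE_MAYBE_PENDING_DELETE, at the cost of one extra open per file.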

-Luke
> -----Original Message-----
> From: pgsql-bugs-owner(at)postgresql(dot)org
> [mailto:pgsql-bugs-owner(at)postgresql(dot)org] On Behalf Of Luke Koops
> Sent: Monday, September 07, 2009 4:30 PM
> To: 'Tom Lane'; Heikki Linnakangas
> Cc: pgsql-bugs(at)postgresql(dot)org
> Subject: Re: [BUGS] BUG #5038: WAL file is pending deletion
> in pg_xlog folder, this interferes with WAL archiving.
>
> > -----Original Message-----
> > From: Tom Lane [mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us]
> > Sent: Monday, September 07, 2009 1:17 PM
> > To: Heikki Linnakangas
> > Cc: Luke Koops; pgsql-bugs(at)postgresql(dot)org
> > Subject: Re: [BUGS] BUG #5038: WAL file is pending deletion in pg_xlog
> > folder, this interferes with WAL archiving.
> >
> > Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
> > > Perhaps we should try to close the old WAL file sooner.
> >
> > There is zero hope of making that work. What we probably need to do
> > is fix the code that scans pg_xlog so that it ignores files that are
> > pending deletion. (I assume there's some way to find that out on
> > Windows.)
> On *nix systems, unlink removes the link from the directory.
> When other processes get a directory listing, the file will
> no longer be listed. On Windows, the file name continues to
> show up in directory listings. The file is in a state called
> pending deletion. Windows documentation doesn't give a
> specific test for this state. Perhaps you could use _access().
> ====================
> From http://support.microsoft.com/kb/159199
>
> This file is in a state known as pending deletion. This file
> has been deleted, but there are still handles open to it.
> NTFS will wait until all handles to this file are closed
> before updating the index. If an attempt is made to access
> the file, however, NTFS will deny the attempt. Because the
> file is listed in the index, but is effectively deleted, you
> can see the file but you cannot access it.
>
> Windows NT returns an "Access Denied" error message when you
> attempt to manipulate the file. You are not able to view the
> permissions, the owner, or the contents of the file. The file
> does, however, show up in a DIR listing in File Manager and
> in Explorer. This occurs even though the user trying to
> access the file has permissions to the file. Even an
> administrator will be unable to take ownership of this file.
> ====================
> >
> > regards, tom lane
> >
