Re: win32 _dosmaperr()

From: "Qingqing Zhou" <zhouqq(at)cs(dot)toronto(dot)edu>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: win32 _dosmaperr()
Date: 2005-08-05 02:33:01
Message-ID: dcuj99$1hfr$1@news.hub.org
Lists: pgsql-hackers


"Magnus Hagander" <mha(at)sollentuna(dot)net> writes:
>
> I suggest you try using Process Explorer from www.sysinternals.com to
> figure out who has the file open. Most of the time it should be able to
> tell you exactly who has locked the file - at least as long as it's done
> from userspace. I'm not 100% sure on how it deals with kernel level
> locks.
>

After running a PG win32 (8.0.1) server for a while with a mix of heavy
operations such as checkpoint and vacuum running together, I encountered
another problem that should be in the same category. PG reports:

"could not unlink 0000xxxx, continuing to try"

at dirmod.c/pgunlink() and loops there forever. Using the Process Explorer
tool you mentioned, I found that only 3 processes hold a handle to the
problematic xlog segment, and all of them are postgres backends. Using the
FileMon tool from the same website, I found that bgwriter tried to OPEN the
xlog segment with ALL ACCESS but failed with result DELETE PEND.
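
For illustration only, here is a minimal sketch of a retry loop of this kind
with a cap on the number of attempts, so that it cannot spin forever. It is
not the dirmod.c code. The names remove_with_limit and sleep_ms, the retry
count, the delay, and the choice of EACCES/EBUSY as the "file is busy" errors
are all assumptions made for the example.

```c
#include <errno.h>
#include <stdio.h>
#include <time.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: sleep for roughly "ms" milliseconds. */
static void
sleep_ms(unsigned int ms)
{
#ifdef _WIN32
    Sleep(ms);
#else
    struct timespec ts;

    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (long) (ms % 1000) * 1000000L;
    nanosleep(&ts, NULL);
#endif
}

/*
 * Hypothetical bounded variant of an unlink retry loop.
 *
 * Tries to remove "path" up to max_tries times (pass at least 1).
 * Retries only while the failure looks like "file is busy"; treating
 * EACCES and EBUSY that way is an assumption of this sketch. Any other
 * error is returned at once.
 *
 * Returns 0 on success, -1 on failure with errno set by the last attempt.
 */
static int
remove_with_limit(const char *path, int max_tries, unsigned int delay_ms)
{
    int attempt;
    int saved_errno = 0;

    for (attempt = 1; attempt <= max_tries; attempt++)
    {
        if (remove(path) == 0)
            return 0;

        /* Remember the error before fprintf/sleep can overwrite errno. */
        saved_errno = errno;
        if (saved_errno != EACCES && saved_errno != EBUSY)
            break;

        if (attempt == 1)
            fprintf(stderr, "could not unlink \"%s\", continuing to try\n",
                    path);
        if (attempt < max_tries)
            sleep_ms(delay_ms);
    }

    errno = saved_errno;
    return -1;
}
```

A caller might use remove_with_limit(path, 50, 100) and, on -1, log the
failure and leave the segment for a later pass instead of blocking.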

That is to say, under some conditions, even if I open a file with the
FILE_SHARE_DELETE flag, I may not be able to remove it while it is still open?
I did some tests, but every time I deleted or renamed an open file, it
succeeded.

Things could get worse, because the whole database cluster may stop working
while it waits for the buffer that bgwriter is working on, and bgwriter is
itself waiting (in the endless loop in pgunlink) for those postgres backends
to move on (so that they can close the problematic xlog segment). That is a
deadlock.

Regards,
Qingqing
