Re: SV: Log files polluted with permission denied error messages after every 10 seconds

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Andrus <kobruleht2(at)hot(dot)ee>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Niels Jespersen <NJN(at)dst(dot)dk>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: SV: Log files polluted with permission denied error messages after every 10 seconds
Date: 2021-03-16 22:18:15
Message-ID: YFEupw8fHXQ/TMLI@paquier.xyz
Lists: pgsql-general

On Tue, Mar 16, 2021 at 01:59:07PM +0200, Andrus wrote:
> I have two Windows 2019 servers. On the Intel Xeon Gold 6226R server it occurs
> every 10 seconds. Last logs:
>
> 2021-03-16 13:48:12 EET     checkpointer LOG:  could not rename file
> "pg_wal/000000010000001100000097": Permission denied
> 2021-03-16 13:48:22 EET     checkpointer LOG:  could not rename file
> "pg_wal/000000010000001100000098": Permission denied
> 2021-03-16 13:48:32 EET     checkpointer LOG:  could not rename file
> "pg_wal/000000010000001100000099": Permission denied
> 2021-03-16 13:48:42 EET     checkpointer LOG:  could not rename file
> "pg_wal/00000001000000110000009A": Permission denied
> 2021-03-16 13:48:52 EET     checkpointer LOG:  could not rename file
> "pg_wal/00000001000000110000009D": Permission denied
> 2021-03-16 13:49:02 EET     checkpointer LOG:  could not rename file
> "pg_wal/0000000100000011000000A0": Permission denied
>
> So it should probably be reproducible on any Windows 2019 server.

Those ten seconds come from RemoveXlogFile(), where pgrename()
retries 100 times, sleeping 100ms between attempts, before giving up.
So something holding the file's handle open prevents the removal from
happening. Attached is the patch that should be tested, based on the
suspected commit. There are actually two scenarios to check:
- Check that the code of 13.2, compiled manually, is enough to see the
failure.
- Check that applying the attached patch makes the failure go away.
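
For illustration, here is a minimal Python sketch of the retry pattern
described above, showing where the ten seconds come from: 100 attempts
with a 100ms sleep after each failure. It is not PostgreSQL's C
implementation; the function name rename_with_retry and its parameters
are hypothetical, and the rename and sleep callables are injectable
only so the behavior can be checked without waiting.

```python
import os
import time


def rename_with_retry(src, dst, attempts=100, delay=0.1,
                      rename=os.replace, sleep=time.sleep):
    """Try to rename src to dst, retrying on PermissionError.

    Hypothetical helper, not PostgreSQL code. Returns True on success and
    False once all attempts have failed. With the defaults, a file that
    stays locked costs about attempts * delay = 10 seconds before giving up.
    """
    for _ in range(attempts):
        try:
            rename(src, dst)
            return True
        except PermissionError:
            # Another process still holds a handle on the file; wait and retry.
            sleep(delay)
    return False
```

If another process keeps the segment open for the whole window, every
attempt fails and the caller gets one failure after roughly ten seconds,
which would match the spacing of the log entries quoted above.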

I am trying on my side to reproduce the problem in a more reliable
way. One thing I saw breaking in my setup is archive_command, where
it was not able to archive a segment with a simple copy, failing with
the same error as yours.

On either of those servers, do you have any files named xlogtemp.N in
pg_wal/? N is an integer, the PID of the process that generated the
file.
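
One way to check for such files is a short Python script like the
sketch below. The function name find_xlogtemp is hypothetical, and the
path to pg_wal is an assumption that has to be adjusted to the actual
data directory.

```python
import re
from pathlib import Path

# Matches names of the form xlogtemp.N, where N is a decimal PID.
XLOGTEMP = re.compile(r"^xlogtemp\.(\d+)$")


def find_xlogtemp(pg_wal_dir):
    """Return (filename, pid) pairs for xlogtemp.N files, sorted by PID.

    Hypothetical helper: pg_wal_dir is the path to the pg_wal directory.
    """
    found = []
    for entry in Path(pg_wal_dir).iterdir():
        match = XLOGTEMP.match(entry.name)
        if match and entry.is_file():
            found.append((entry.name, int(match.group(1))))
    return sorted(found, key=lambda item: item[1])
```

For example, find_xlogtemp(r"C:\path\to\data\pg_wal") (a placeholder
path) returns an empty list when no such files exist.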
--
Michael

Attachment Content-Type Size
0001-Revert-Remove-HAVE_WORKING_LINK.patch text/x-diff 5.3 KB