Re: .ready files appearing on slaves

From: Jehan-Guillaume de Rorthais <jgdr(at)dalibo(dot)com>
To: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: .ready files appearing on slaves
Date: 2014-09-15 15:37:24
Message-ID: 20140915173724.2dae0dcd@erg
Lists: pgsql-hackers

Hi hackers,

An issue that seems related to this has been posted on pgsql-admin. See:

http://www.postgresql.org/message-id/CAAS3tyLnXaYDZ0+zhXLPdVtOvHQOvR+jSPhp30o8kvWqQs0Tqw@mail.gmail.com

How can we help with this issue?

Cheers,

On Thu, 4 Sep 2014 17:50:36 +0200
Jehan-Guillaume de Rorthais <jgdr(at)dalibo(dot)com> wrote:

> Hi hackers,
>
> For a few months now, we have occasionally seen .ready files appearing on
> some slave instances in various contexts. The two I have in mind are running
> 9.2.x.
>
> I tried to investigate a bit. These .ready files are created when a WAL file
> in pg_xlog has no corresponding file in pg_xlog/archive_status. I could
> easily reproduce this by deleting such a file: it is created again at the
> next restartpoint or checkpoint received from the master.
>
> Looking at the WALs in the pg_xlog folder corresponding to these .ready
> files, they are all much older than the current WAL "cycle", both in mtime
> and in name sequence. For instance, on one of these boxes we currently have
> 6 of those "ghost" WALs:
>
> 0000000200001E53000000FF
> 0000000200001F18000000FF
> 0000000200002047000000FF
> 00000002000020BF000000FF
> 0000000200002140000000FF
> 0000000200002370000000FF
> 000000020000255D000000A8
> 000000020000255D000000A9
> [...normal WAL sequence...]
> 000000020000255E0000009D
>
> And on another box:
>
> 000000010000040E000000FF
> 0000000100000414000000DA
> 000000010000046E000000FF
> 0000000100000470000000FF
> 00000001000004850000000F
> 000000010000048500000010
> [...normal WAL sequence...]
> 000000010000048500000052
>
> So it seems that, for some reason, these old WALs were "forgotten" by the
> restartpoint mechanism when they should have been recycled/deleted.
>
> For one of these servers, I could correlate this with an abrupt
> disconnection of the streaming replication recorded in its logs. But there
> was no known SR disconnection on the second one.
>
> Any idea about this weird behaviour? What can we do to help you investigate
> further?
>
> Regards,
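
For anyone who wants to run the same check on their own standbys, here is a
minimal Python sketch of the comparison described in the quoted message. It
assumes the 9.2 layout (a pg_xlog directory with an archive_status
subdirectory holding <segment>.ready or <segment>.done marker files). The
function name wal_status is made up for illustration and is not part of
PostgreSQL.

```python
import os
import re

# A WAL segment file name is 24 uppercase hex digits
# (timeline, log id, segment), e.g. 0000000200001E53000000FF.
WAL_NAME = re.compile(r"^[0-9A-F]{24}$")


def wal_status(pg_xlog):
    """Map each WAL segment found in pg_xlog to 'ready', 'done' or None.

    None means there is no marker file for that segment in
    pg_xlog/archive_status. Hypothetical helper, for illustration only.
    """
    status_dir = os.path.join(pg_xlog, "archive_status")
    result = {}
    for name in sorted(os.listdir(pg_xlog)):
        if not WAL_NAME.match(name):
            continue  # skip archive_status, history and backup files
        if os.path.exists(os.path.join(status_dir, name + ".ready")):
            result[name] = "ready"
        elif os.path.exists(os.path.join(status_dir, name + ".done")):
            result[name] = "done"
        else:
            result[name] = None
    return result
```

Calling wal_status() on a standby's pg_xlog path and keeping the entries
whose value is "ready" gives the list of segments to compare against the
current WAL sequence, as in the listings above.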

--
Jehan-Guillaume de Rorthais
Dalibo
http://www.dalibo.com
