Re: [BUGS] Bug in Physical Replication Slots (at least 9.5)?

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: michael(dot)paquier(at)gmail(dot)com
Cc: nag1010(at)gmail(dot)com, jdnelson(at)dyn(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [BUGS] Bug in Physical Replication Slots (at least 9.5)?
Date: 2017-08-28 11:02:40
Message-ID: 20170828.200240.73003255.horiguchi.kyotaro@lab.ntt.co.jp
Lists: pgsql-bugs pgsql-hackers

Hello,

This problem still occurs on master.
I have rebased the patch onto the current master.

At Mon, 3 Apr 2017 08:38:47 +0900, Michael Paquier <michael(dot)paquier(at)gmail(dot)com> wrote in <CAB7nPqT8dQk_Ce29YQ0CKAQ7htLDyUHNdFv6dELe4PkYr3SSjA(at)mail(dot)gmail(dot)com>
> On Mon, Apr 3, 2017 at 7:19 AM, Venkata B Nagothi <nag1010(at)gmail(dot)com> wrote:
> > As we are already past the commitfest, I am not sure, what should i change
> > the patch status to ?
>
> The commit fest finishes on the 7th of April. Even with the deadline
> passed, there is nothing preventing to work on bug fixes. So this item
> ought to be moved to the next CF with the same category.

The steps to reproduce the problem follow.

- Apply the second patch (0002-) attached and recompile. It
effectively reproduces the problematic state of the database.

- M(aster): initdb the master with wal_keep_segments = 0
(default), log_min_messages = debug2
- M: Create a physical repslot.
- S(tandby): Set up a standby database.
- S: Edit recovery.conf to use the replication slot above, then
start the standby.
- S: touch /tmp/hoge
- M: Run pgbench ...
- S: After a while, the standby stops.
> LOG: #################### STOP THE SERVER

- M: Stop pgbench.
- M: Do 'checkpoint;' twice.
- S: rm /tmp/hoge
- S: The standby fails to catch up, with the following error.

> FATAL: could not receive data from WAL stream: ERROR: requested WAL segment 00000001000000000000002B has already been removed
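
To see which WAL positions the removed segment covers, the file name in the error can be decoded by hand. The following is only a minimal sketch: it assumes the default 16 MB WAL segment size, and the helper names (format_lsn, segment_lsn_range) are hypothetical, not part of PostgreSQL.

```python
# Assumption: default WAL segment size of 16 MB (it can be changed at build/initdb time).
WAL_SEGMENT_SIZE = 16 * 1024 * 1024
# Number of segments per 4 GB "log id" unit: 256 with 16 MB segments.
SEGMENTS_PER_XLOGID = 0x100000000 // WAL_SEGMENT_SIZE


def format_lsn(pos):
    """Format a 64-bit WAL position as the usual 'hi/lo' hexadecimal LSN."""
    return "%X/%X" % (pos >> 32, pos & 0xFFFFFFFF)


def segment_lsn_range(filename):
    """Return (timeline, first_lsn, last_lsn) for a 24-character WAL segment name."""
    if len(filename) != 24:
        raise ValueError("expected a 24-character WAL segment name")
    timeline = int(filename[0:8], 16)   # first 8 hex digits: timeline ID
    log_id = int(filename[8:16], 16)    # middle 8 hex digits: high part of the position
    seg_id = int(filename[16:24], 16)   # last 8 hex digits: segment within that part
    segno = log_id * SEGMENTS_PER_XLOGID + seg_id
    start = segno * WAL_SEGMENT_SIZE
    end = start + WAL_SEGMENT_SIZE - 1
    return timeline, format_lsn(start), format_lsn(end)


print(segment_lsn_range("00000001000000000000002B"))
# -> (1, '0/2B000000', '0/2BFFFFFF')
```

Under that assumption, the standby is asking for a position somewhere in 0/2B000000 .. 0/2BFFFFFF on timeline 1, which the master has already recycled.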

The first patch (0001-) fixes this problem. It prevents the
problematic state of WAL segments by holding back the restart LSN
of a physical replication slot under a certain condition.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

  0001-Retard-restart-LSN-of-a-slot-when-a-segment-starts-w.patch  (text/x-patch, 7.3 KB)
  0002-Debug-assistant-code.patch  (text/x-patch, 1.4 KB)
