Re: Some problems about cascading replication

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Some problems about cascading replication
Date: 2011-08-16 13:25:15
Message-ID: CA+U5nMKtbD++BOma59KZqB6UMzxapj9VL7ZFd1kV7aP6KwAcoA@mail.gmail.com
Lists: pgsql-hackers

On Tue, Aug 16, 2011 at 9:55 AM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:

> When I tested PITR on git master with max_wal_senders > 0,
> I found that the following inappropriate log message was always
> output even though cascading replication was not in progress. The
> attached patch fixes this problem.
>
>    LOG:  terminating all walsender processes to force cascaded
> standby(s) to update timeline and reconnect
>
> When making the patch, I found another problem with cascading
> replication. When promoting a cascading standby, postmaster sends
> SIGUSR2 to any cascading walsenders to kill them. But there is a
> corner case where such a walsender fails to receive SIGUSR2 and
> unexpectedly survives a standby promotion. This happens when
> postmaster sends SIGUSR2 before the walsender marks itself as
> a WAL sender, because postmaster sends SIGUSR2 only to the
> processes marked as WAL senders.
>
> To avoid the corner case, I changed walsender so that it checks
> again whether recovery is in progress after marking itself
> as a WAL sender. If recovery is not in progress even though the
> walsender is a cascading one, it does the same thing as the SIGUSR2
> signal handler does, and then exits later. The attached patch also
> includes this fix.

Looks like valid problems and appropriate fixes to me. Will commit.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
