Re: [streaming replication] 9.1.3 streaming replication bug ?

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: 乔志强 <qiaozhiqiang(at)leadcoretech(dot)com>
Cc: PostgreSQL pg-general List <pgsql-general(at)postgresql(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [streaming replication] 9.1.3 streaming replication bug ?
Date: 2012-04-11 15:56:11
Message-ID: CAHGQGwHQraLqGnNDEM6kav_UnouQ76bdcvxKxOEeGgYWOE_5Bw@mail.gmail.com
Lists: pgsql-general pgsql-hackers

On Wed, Apr 11, 2012 at 3:31 PM, 乔志强 <qiaozhiqiang(at)leadcoretech(dot)com> wrote:
> So in sync streaming replication, if master delete WAL before sent to the only standby, all transaction will fail forever,
> "the master tries to avoid a PANIC error rather than termination of replication." but in sync replication, termination of replication is THE bigger PANIC error.

I see your point. While backends are waiting for replication, WAL files that
the standby might not have received yet must not be removed. If they are
removed, replication keeps failing forever because the required WAL files no
longer exist on the master, and the waiting backends are never released unless
the replication mode is changed to async. This should be avoided.

To fix this issue, we should prevent the master from deleting the WAL file that
contains the minimum waiting LSN, as well as any later files. I'll think about
this more and implement the patch.
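
As a rough illustration of that rule (not the actual patch, which would live in
the server's C code), here is a minimal Python sketch. It assumes an LSN can be
treated as a flat byte position and that segments are 16 MB; the names
segment_of and removable_segments are hypothetical and exist only for this
example.

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumed segment size (16 MB)


def segment_of(lsn: int) -> int:
    """Return the number of the WAL segment containing byte position lsn."""
    return lsn // WAL_SEGMENT_SIZE


def removable_segments(existing, redo_lsn, waiting_lsns):
    """Return the segment numbers that are safe to delete (hypothetical helper).

    existing     -- segment numbers currently on disk
    redo_lsn     -- checkpoint redo position; older segments are normally removable
    waiting_lsns -- LSNs that backends are waiting on for sync replication
    """
    cutoff = segment_of(redo_lsn)
    if waiting_lsns:
        # Keep the segment holding the minimum waiting LSN and every later one.
        cutoff = min(cutoff, segment_of(min(waiting_lsns)))
    return sorted(seg for seg in existing if seg < cutoff)
```

With no waiting backends the cutoff is just the checkpoint redo segment, so
behaviour is unchanged; when a backend is waiting on an older LSN, the cutoff
moves back so the segment the standby still needs stays on disk.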

Regards,

--
Fujii Masao
