Re: Inconsistent DB data in Streaming Replication

From: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To: "'Florian Pflug'" <fgp(at)phlo(dot)org>
Cc: "'Fujii Masao'" <masao(dot)fujii(at)gmail(dot)com>, "'Andres Freund'" <andres(at)2ndquadrant(dot)com>, "'Hannu Krosing'" <hannu(at)2ndquadrant(dot)com>, "'Sameer Thakur'" <samthakur74(at)gmail(dot)com>, "'Ants Aasma'" <ants(at)cybertec(dot)at>, <sthomas(at)optionshouse(dot)com>, "'Tom Lane'" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "'Samrat Revagade'" <revagade(dot)samrat(at)gmail(dot)com>, "'PostgreSQL-development'" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Inconsistent DB data in Streaming Replication
Date: 2013-04-17 13:11:12
Message-ID: 005601ce3b6d$09785c40$1c6914c0$@kapila@huawei.com
Lists: pgsql-hackers

On Wednesday, April 17, 2013 4:19 PM Florian Pflug wrote:
> On Apr17, 2013, at 12:22 , Amit Kapila <amit(dot)kapila(at)huawei(dot)com> wrote:
> > Do you mean to say that as an error has occurred, so it would not be able to
> > flush received WAL, which could result in loss of WAL?
> > I think even if error occurs, it will call flush in WalRcvDie(), before
> > terminating WALReceiver.
>
> Hm, true, but for that to prevent the problem the inner processing
> loop needs to always read up to EOF before it exits and we attempt
> to send a reply. Which I don't think it necessarily does. Assume,
> that the master sends a chunk of data, waits a bit, and finally
> sends the shutdown record and exits. The slave might then receive
> the first chunk, and it might trigger sending a reply. At the time
> the reply is sent, the master has already sent the shutdown record
> and closed the connection, and we'll thus fail to reply and abort.
> Since the shutdown record has never been read from the socket,
> XLogWalRcvFlush won't flush it, and the slave ends up behind the
> master.
>
> Also, since XLogWalRcvProcessMsg responds to keepalive messages,
> we might also error out of the inner processing loop if the server
> closes the socket after sending a keepalive but before we attempt
> to respond.
>
> Fixing this on the receive side alone seems quite messy and fragile.
> So instead, I think we should let the master send a shutdown message
> after it has sent everything it wants to send, and wait for the client
> to acknowledge it before shutting down the socket.
>
> If the client fails to respond, we could log a fat WARNING.

Your explanation seems reasonable, but before discussing the exact solution,
I think we should first confirm that the problem can actually be reproduced.
Once it can, it would be better to discuss this solution.
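
As a starting point for reasoning about a reproduction, the sequence described
above can be modelled with a small in-memory sketch. This is only a toy model,
not walsender/walreceiver code: ToyLink, run_standby, the "DONE" marker and the
record names are all hypothetical, and the model assumes the standby tries to
reply after every record it reads.

```python
from collections import deque


class ToyLink:
    """Hypothetical in-memory stand-in for the replication connection."""

    def __init__(self):
        self.inbox = deque()      # records sent by the master, not yet read
        self.peer_closed = False  # True once the master has closed its end

    def master_send(self, record):
        self.inbox.append(record)

    def master_close(self):
        self.peer_closed = True

    def standby_read_one(self):
        return self.inbox.popleft() if self.inbox else None

    def standby_reply(self):
        if self.peer_closed:
            raise ConnectionError("peer closed the connection")


def run_standby(link):
    """Read records one by one, replying after each (an assumption).

    If a reply fails, stop and return only what was already read, which
    is all a final flush could write out.
    """
    received = []
    try:
        while True:
            record = link.standby_read_one()
            if record is None:
                break
            received.append(record)
            link.standby_reply()
    except ConnectionError:
        pass  # error path: unread records stay in link.inbox
    return received


def master_shutdown_without_ack():
    """Master sends everything and closes without waiting."""
    link = ToyLink()
    link.master_send("chunk-1")
    link.master_send("shutdown-record")
    link.master_close()
    return run_standby(link), list(link.inbox)


def master_shutdown_with_ack():
    """Master sends a final marker and closes only after it was consumed."""
    link = ToyLink()
    link.master_send("chunk-1")
    link.master_send("shutdown-record")
    link.master_send("DONE")  # hypothetical end-of-stream message
    flushed = run_standby(link)  # replies succeed: socket is still open
    if flushed and flushed[-1] == "DONE":
        link.master_close()
    return flushed, list(link.inbox)
```

In the first function the shutdown record is left unread when the reply fails,
so the standby ends up behind; in the second nothing is left unread. Whether
the real code can actually hit the first case is exactly what a reproduction
would need to show.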

With Regards,
Amit Kapila.
