Magnus Hagander wrote:
> On Thu, Jan 14, 2010 at 15:36, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Thu, Jan 14, 2010 at 9:15 AM, Heikki Linnakangas
>> <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>>> Imagine this scenario:
>>> 1. Master is up and running, standby is connected and streaming happily
>>> 2. Network goes down, connection is broken.
>>> 3. Standby falls behind a lot. Old WAL files that the standby needs are
>>> archived, and deleted from master.
>>> 4. Network is restored. Standby reconnects
>>> 5. Standby will get an error because the WAL file it needs is not in the
>>> master anymore.
>>> What will currently happen is:
>>> 6. Standby retries connecting and failing indefinitely, until the admin
>>> restarts it.
>>> What we would *like* to happen is:
>>> 6. Standby fetches the missing WAL files from archive, then reconnects
>>> and continues streaming.
>>> Can we fix that?
>> Just MHO here, but this seems like a bigger project than we should be
>> starting at this stage of the game.
> We want this eventually (heck, it'd be awesome!), but let's get what
> we have now stable first.
If we don't fix that within the server, we will need to document that
caveat and every installation will need to work around that one way or
another. Maybe with some monitoring software and an automatic restart. Ugh.
I wasn't really asking if it's possible to fix, I meant "Let's think
about *how* to fix that".
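
To make the desired step 6 concrete, here is a rough sketch of the control flow only, in Python rather than server code. Everything in it is hypothetical and made up for illustration (the Master and Standby classes, the SegmentRemoved error, WAL segments modelled as plain integers, the archive as a set); none of it corresponds to actual PostgreSQL functions or to the walreceiver protocol. It only shows the idea: when the master reports that the requested segment is gone, replay from the archive as far as it reaches, then go back to streaming.

```python
from dataclasses import dataclass, field


class SegmentRemoved(Exception):
    """Hypothetical error: the master no longer has the requested WAL segment."""


@dataclass
class Master:
    oldest: int  # oldest segment still present on the master
    newest: int  # newest segment the master can stream

    def stream_from(self, seg):
        # Streaming fails if the segment was already archived and deleted.
        if seg < self.oldest:
            raise SegmentRemoved(seg)
        return range(seg, self.newest + 1)


@dataclass
class Standby:
    next_seg: int  # next segment this standby needs to replay
    log: list = field(default_factory=list)

    def catch_up(self, master, archive):
        while True:
            try:
                for seg in master.stream_from(self.next_seg):
                    self.log.append(("stream", seg))
                    self.next_seg = seg + 1
                return
            except SegmentRemoved:
                # If the archive does not have the segment either, give up
                # instead of retrying forever.
                if self.next_seg not in archive:
                    raise
                # Replay from the archive as far as it goes, then loop
                # around and try streaming again.
                while self.next_seg in archive:
                    self.log.append(("archive", self.next_seg))
                    self.next_seg += 1


standby = Standby(next_seg=3)
standby.catch_up(Master(oldest=7, newest=9), archive={3, 4, 5, 6})
print(standby.log)
```

In this toy run the standby replays segments 3 through 6 from the archive and then streams 7 through 9. The only case that still ends in an error is a segment that is missing from both the master and the archive, which is arguably the one where an admin really does have to step in.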