Re: Streaming replication - unable to stop the standby

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Streaming replication - unable to stop the standby
Date: 2010-05-03 18:49:35
Message-ID: AANLkTimRQAtM-EEnVYxTgBqQlgAYW74rRAX1PVyP00ko@mail.gmail.com
Lists: pgsql-hackers

On Mon, May 3, 2010 at 2:47 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> Hmm.  When I committed that patch to fix smart shutdown on the
>> standby, we discussed the fact that the startup process can't simply
>> release its locks and die at shutdown time because the locks it holds
>> prevent other backends from seeing the database in an inconsistent
>> state.  Therefore, if we were to terminate recovery as soon as the
>> smart shutdown request is received, we might never complete, because a
>> backend might be waiting on a lock that will never get released.  If
>> that's really a danger scenario, then it follows that we might also
>> fail to shut down if we can't connect to the primary, because we might
>> not be able to replay enough WAL to release the locks the remaining
>> backends are waiting for.  That sort of looks like what is happening
>> to you, except based on your test scenario I can't figure out where
>> this came from:
>
>> FATAL:  replication terminated by primary server
>
> I suspect you have it right, because my experiments where the standby
> did shut down correctly were all done with an idle master.
>
> Seems like we could go ahead and forcibly kill the startup process *once
> all the standby backends are gone*.  There is then no need to worry
> about not releasing locks, and re-establishing a consistent state when
> we later restart is logic that we have to have anyway.

That's exactly what we already do. The problem is that smart shutdown
doesn't actually kill off the standby backends; it waits for them to
exit on their own. But if they're blocked on a lock that's never
going to get released, they never exit.

...Robert
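
The failure mode described in this message can be illustrated with a toy
model. This is only a sketch under stated assumptions, not PostgreSQL's
actual postmaster or startup-process logic: the names Backend,
smart_shutdown, held_locks and replayable_releases are hypothetical and
exist only for this illustration. It assumes that a backend exits on its
own unless it is waiting on a lock the startup process still holds, that
each replayed WAL record releases one such lock, and that the startup
process is killed only once no backends remain.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Backend:
    """Hypothetical stand-in for a standby backend (not a PostgreSQL type)."""
    name: str
    waiting_on: Optional[str] = None  # lock this backend is blocked on, if any


def smart_shutdown(backends: List[Backend],
                   held_locks: Iterable[str],
                   replayable_releases: Iterable[str]) -> str:
    """Toy model of smart shutdown on a standby.

    held_locks: locks currently held by the startup process.
    replayable_releases: locks that would be released, in order, by WAL
        the standby is still able to replay (empty if the primary is
        unreachable and no more WAL is available).
    """
    held = set(held_locks)
    pending = list(replayable_releases)
    live = list(backends)
    while True:
        # Backends that are not blocked on a held lock exit on their own.
        live = [b for b in live if b.waiting_on in held]
        if not live:
            # All backends are gone, so the startup process can be killed.
            return "shutdown complete"
        if not pending:
            # No more WAL to replay: the blocking locks are never released.
            return "stuck"
        # Replay one more WAL record, which releases one lock.
        held.discard(pending.pop(0))


if __name__ == "__main__":
    blocked = [Backend("idle"), Backend("reader", waiting_on="rel_lock")]
    # WAL releasing the lock is still available: shutdown finishes.
    print(smart_shutdown(blocked, {"rel_lock"}, ["rel_lock"]))
    # Primary unreachable, nothing left to replay: shutdown hangs.
    print(smart_shutdown(blocked, {"rel_lock"}, []))
```

In this model the second call never reaches the point where the startup
process may be killed, which corresponds to the hang reported in the
thread: the condition "all standby backends are gone" is never satisfied
because smart shutdown only waits for the blocked backend.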