| From: | Andres Freund <andres(at)2ndquadrant(dot)com> | 
|---|---|
| To: | Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> | 
| Cc: | Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org> | 
| Subject: | Re: Slave enters in recovery and promotes when WAL stream with master is cut + delay master/slave | 
| Date: | 2013-01-17 16:55:07 | 
| Message-ID: | 20130117165507.GD22844@awork2.anarazel.de | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
On 2013-01-17 18:50:35 +0200, Heikki Linnakangas wrote:
> On 17.01.2013 18:42, Andres Freund wrote:
> >On 2013-01-17 18:33:42 +0200, Heikki Linnakangas wrote:
> >>On 17.01.2013 17:42, Andres Freund wrote:
> >>>Ok, the attached patch seems to fix a) and b). c) above is bogus, as
> >>>explained in a comment in the patch.  I also noticed that the TLI check
> >>>didn't mark the last source as failed.
> >>
> >>This looks fragile:
> >>
> >>>			/*
> >>>			 * We only end up here without a message when XLogPageRead() failed
> >>>			 * - in that case we already logged something.
> >>>			 * In StandbyMode that only happens if we have been triggered, so
> >>>			 * we shouldn't loop anymore in that case.
> >>>			 */
> >>>			if (errormsg == NULL)
> >>>				break;
> >>
> >>I don't like relying on the presence of an error message to control logic
> >>like that. Should we throw in an explicit CheckForStandbyTrigger() check in
> >>the condition of that loop?
> >
> >I agree, I wasn't too happy about that either. But in some sense its
> >only propagating state from XLogReadPage which already has dealt with
> >the error and decided its ok.
> >Its the solution closest to what happened in the old implementation,
> >thats why I thought it would be halfway to acceptable.
> >
> >Adding the CheckForStandbyTrigger() in the condition would mean
> >promotion would happen before all the available records are processed
> >and it would increase the amount of stat()s tremendously.
> >So I don't really like that either.
> 
> I was thinking of the attached. As long as we check for
> CheckForStandbyTrigger() after the "record == NULL" check, we won't perform
> extra stat() calls on successful reads, only when we're polling after
> reaching the end of valid WAL. That seems acceptable. 
Looks good to me.
Greetings,
Andres Freund
-- 
 Andres Freund	                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Dimitri Fontaine | 2013-01-17 17:06:50 | Re: Event Triggers: adding information | 
| Previous Message | Heikki Linnakangas | 2013-01-17 16:50:35 | Re: Re: Slave enters in recovery and promotes when WAL stream with master is cut + delay master/slave |