Re: Hot Standby startup with overflowed snapshots

From: Chris Redekop <chris(at)replicon(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hot Standby startup with overflowed snapshots
Date: 2011-10-28 02:42:47
Message-ID: CAC2SuRLM07gseDBeyqTL2AfkQmOHvkcNvpBV_qRxoPkO76-FwA@mail.gmail.com
Lists: pgsql-hackers

Sorry..."designed" was a poor choice of words, I meant "not unexpected".
Doing the checkpoint right after pg_stop_backup() looks like it will work
perfectly for me, so thanks for all your help!
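
For the archives, here is a rough sketch of that workaround. It is only an
illustration under assumptions: the helper name is made up, `conn` is assumed
to be a DB-API 2.0 connection to the primary (psycopg2, for example) opened as
a superuser, and the base backup is assumed to have been started earlier with
pg_start_backup().

```python
def stop_backup_and_checkpoint(conn):
    """Finish the base backup, then force a checkpoint on the primary.

    Hypothetical helper: `conn` is assumed to be a DB-API 2.0 connection
    to the primary with superuser rights.
    """
    cur = conn.cursor()
    try:
        # End the base backup; the function returns the ending WAL location.
        cur.execute("SELECT pg_stop_backup()")
        stop_location = cur.fetchone()[0]
        # Force a checkpoint right away instead of waiting up to
        # checkpoint_timeout for the next one to happen on its own.
        cur.execute("CHECKPOINT")
    finally:
        cur.close()
    return stop_location
```

The point is only the ordering: the checkpoint is issued immediately after
pg_stop_backup() returns, which is what let the standby come right up in my
tests.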

On a side note I am sporadically seeing another error on hot standby startup.
I'm not terribly concerned about it as it is pretty rare and it will work
on a retry so it's not a big deal. The error is "FATAL: out-of-order XID
insertion in KnownAssignedXids". If you think it might be a bug and are
interested in hunting it down let me know and I'll help any way I can...but
if you're not too worried about it then neither am I :)

On Thu, Oct 27, 2011 at 4:55 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:

> On Thu, Oct 27, 2011 at 10:09 PM, Chris Redekop <chris(at)replicon(dot)com> wrote:
>
> > hrmz, still basically the same behaviour. I think it might be a *little*
> > better with this patch. Before when under load it would start up quickly
> > maybe 2 or 3 times out of 10 attempts....with this patch it might be up
> > to 4 or 5 times out of 10...ish...or maybe it was just a fluke *shrug*.
> > I'm still only seeing your log statement a single time (I'm running at
> > debug2). I have discovered something though - when the standby is in this
> > state if I force a checkpoint on the primary then the standby comes right
> > up. Is there anything I can check or try for you to help figure this
> > out?....or is it actually as designed that it could take 10-ish minutes
> > to start up even after all clients have disconnected from the primary?
>
> Thanks for testing. The improvements cover specific cases, so it's not
> subject to chance; it's not a performance patch.
>
> It's not "designed" to act the way you describe, but it does.
>
> The reason this occurs is that you have a transaction-heavy workload
> with occasional periods of complete quiet and a base backup time that
> is much less than checkpoint_timeout. If your base backup was slower
> the checkpoint would have hit naturally before recovery had reached a
> consistent state. Which seems fairly atypical. I guess you're doing
> this on a test system.
>
> It seems cheap to add in a call to LogStandbySnapshot() after each
> call to pg_stop_backup().
>
> Does anyone think this case is worth adding code for? Seems like one
> more thing to break.
>
> --
> Simon Riggs http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
>
