Re: sync rep design architecture (was "disposition of remaining patches")

From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Daniel Farina <daniel(at)heroku(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: sync rep design architecture (was "disposition of remaining patches")
Date: 2011-02-25 16:40:27
Message-ID: 4D67DB7B.8030801@2ndquadrant.com
Lists: pgsql-hackers

Daniel Farina wrote:
> Server A syncreps to Server B
>
> Now I want to provision server A-prime, which will eventually take the
> place of A.
>
> Server A syncreps to Server B
> Server A syncreps to Server A-prime
>
> Right now, as it stands, the syncrep patch will be happy as soon as
> the data has been fsynced to either B or A-prime; I don't think we can
> guarantee at any point that A-prime can become the leader, and feed B.
>

This concern is one of the fundamental gaps between how this patch
implements sync rep and what some people might expect. Tight control
over the exact order of failover isn't here yet, so sometimes people
will need to be creative to work within the restrictions of what is
available. The path for this case is probably:

1) Wait until A' is caught up
2) Switch over to B as the new master, with A' as its standby and A
going off-line at the same time.
3) Switch the master role over from B to A'. Bring up B as its standby.
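
As a rough illustration only, the three steps above can be sketched as a
toy state model in Python. Nothing here is PostgreSQL code or part of
the patch: the Cluster class, switchover(), two_step_replacement() and
the caught_up callback are all hypothetical names made up for this
sketch. It just makes the intermediate state visible, where B is
briefly the master with A' as its only standby.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy model of one master plus standbys (hypothetical, not PostgreSQL)."""
    master: str
    standbys: set = field(default_factory=set)
    offline: set = field(default_factory=set)

    def switchover(self, new_master, retire_old=False):
        """Promote a standby; the old master goes off-line or becomes a standby."""
        if new_master not in self.standbys:
            raise ValueError(f"{new_master} is not a standby")
        old = self.master
        self.standbys.remove(new_master)
        self.master = new_master
        if retire_old:
            self.offline.add(old)
        else:
            self.standbys.add(old)
        return self


def two_step_replacement(caught_up):
    """Replace A with A' by going through B, following steps 1-3."""
    cluster = Cluster(master="A", standbys={"B", "A'"})

    # Step 1: refuse to proceed until A' is caught up.
    # caught_up is an assumed callback supplied by the caller.
    if not caught_up("A'"):
        raise RuntimeError("A' has not caught up yet")

    # Step 2: B becomes the master, A' stays a standby, A goes off-line.
    cluster.switchover("B", retire_old=True)
    middle = (cluster.master, sorted(cluster.standbys))

    # Step 3: A' takes over the master role, B comes back as its standby.
    cluster.switchover("A'")
    return middle, cluster


middle, final = two_step_replacement(lambda name: True)
print("intermediate:", middle)
print("final master:", final.master, "standbys:", sorted(final.standbys))
```

The point of returning the intermediate state is that it is exactly the
window being objected to: a period where B is the master.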

There are other possible transition plans available too.

I appreciate that you would like to do this as an atomic operation,
rather than handling it as two steps--one of which puts you in a middle
point where B, a possibly inferior standby, is operating as the master.
There are a dozen other complicated "my use case says I want <X> and it
must be done as <Y>" requests for Sync Rep floating around here, too.
They're all getting ignored in favor of something smaller that can get
built today.

The first question I'd ask is whether you could settle for this
switchover plan for now, even though it's more cumbersome than you'd
prefer. The second is whether implementing what this feature currently
does would get in the way of coding what you really want eventually.

I didn't get the Streaming Rep + Hot Standby features I wanted in 9.0
either. But committing what was reasonable to include in that version
let me march forward with very useful new code, doing another year of
development on my own projects and getting some new things fixed in
core. And so far it looks like 9.1 will sort out all of the kinks I was
unhappy about. The same sort of thing will need to happen to get Sync
Rep committed and then appropriate for more use cases. There isn't any
margin left for discussions of scope creep here; really it's "is this
subset useful for some situations and stable enough to commit" now.

> 2. The unprivileged user can disable syncrep, in any situation. This
> flexibility is *great*, but you don't really want people to do it when
> one is performing the switchover.

For the moment you may have to live with a situation where user
connections must be blocked during the brief moment of switchover to
eliminate this issue. That's what I end up doing with 9.0 production
systems to get a really clean switchover; there's a second of hiccup
even in the best case. I'm not sure yet of the best way to build a UI
to make that more transparent in the sync rep case. It's sure not a
problem that's going to get solved in this release though.

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
