Re: Sync Rep Design

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Aidan Van Dyk <aidan(at)highrise(dot)ca>
Cc: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Sync Rep Design
Date: 2010-12-31 16:56:38
Message-ID: 1293814598.1892.41334.camel@ebony
Lists: pgsql-hackers

On Fri, 2010-12-31 at 07:33 -0500, Aidan Van Dyk wrote:
> On Fri, Dec 31, 2010 at 5:26 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>
> > Your picture above is a common misconception. I will add something to
> > the docs to explain this.
>
> > 2. "sync" does not guarantee that the updates to the standbys are in any
> > way coordinated. You can run a query on one standby and get one answer
> > and at the exact same time run the same query on another standby and get
> > a different answer (slightly ahead/behind). That also means that if the
> > master crashes one of the servers will be ahead or behind. You can use
> > pg_last_xlog_receive_location() to check which one that is.
> >
> > When people say they want *all* servers to respond, it's usually because
> > they want (2), but that is literally impossible in a distributed system.
>
> Just to try and be clear again, in "sync" that Stefan and I are
> talking about, we really don't care that the slave could be a "hot
> standby" answering queries. In fact, mine wouldn't be. Mine would
> likely be pg_streamrecv or something. I'm just looking for a
> guarantee that I've got a copy of the data safely in the next rack,
> and a separate building before I tell the client I've moved his money.
>
> I want a synchronous replication of the *data*, and not a system where
> I can distribute queries. I'm looking for disaster mitigation, not
> load mitigation. A replacement for clustered/replicated
> devices/filesystems under pg_xlog.
>
> Having the next rack slave be "hot" in terms of applying WAL and ready
> to take over instantly would be a bonus, as long as I can guarantee
> it's current (i.e. has all data a primary's COMMIT has acknowledged).
>
> So, that's what I want, and that's what your docs suggest is
> impossible currently; first-past-the-post means that I can only ever
> reliably configure 1 sync slave and be sure it will have all
> acknowledged commits. I can likely get *close* to that by putting
> only my "slowest" slave as the only sync slave, and monitoring the
> heck out of my "asynchronous but I want to be synchronous" slave, but
> I'd rather trust the PG community to build robust synchronization
> than myself to build robust enough monitoring to catch that my slave
> is farther behind than the slower synchronous one.
>
> That said, I think the expectation is that if I were building a
> query-able "hot standby" cluster in sync rep mode, once I get a commit
> confirmation, I should be able to then initiate a new transaction on
> any member of that sync rep cluster and see the data I just committed.
>
> Yes, I know I could see *newer* data. And I know that the primary
> could already have newer data. Yes, we have the problem even on a
> single pg cluster on a single machine. But the point is that if
> you've committed, any new transactions see *at least* that data or
> newer. But no chance of older.
>
> But personally, I'm not interested in that ;-)

I understand your requirements, listed above.

There are good technical reasons why trying to achieve *all* of the
above compromises the other, unstated requirements: availability,
low complexity, performance, etc.
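
A rough sketch of that trade-off, under a deliberately simplified model
that is not taken from the patch: each standby either acknowledges a
commit after some latency or is unreachable (None). The function name,
the policy names "first" and "all", and the latency figures are all
hypothetical.

```python
from typing import Optional, Sequence


def commit_wait(ack_latencies_ms: Sequence[Optional[float]],
                policy: str) -> Optional[float]:
    """Return how long the master waits before confirming a commit.

    ack_latencies_ms holds one entry per sync standby; None means the
    standby is down or unreachable. A return value of None means the
    commit would block (no confirmation can be sent).
    """
    live = [t for t in ack_latencies_ms if t is not None]
    if policy == "first":
        # First responder wins: the fastest live standby releases the commit.
        return min(live) if live else None
    if policy == "all":
        # Every standby must answer: one outage blocks all commits,
        # and the slowest standby sets the latency.
        if not live or len(live) < len(ack_latencies_ms):
            return None
        return max(live)
    raise ValueError(f"unknown policy: {policy!r}")


# Hypothetical latencies: next rack = 2 ms, separate building = 40 ms.
print(commit_wait([2.0, 40.0], "first"))
print(commit_wait([2.0, 40.0], "all"))
print(commit_wait([2.0, None], "all"))
```

In this model "all" gives the guarantee asked for in the quoted text, but
only by paying the slowest standby's latency on every commit and by
blocking whenever any one standby is unavailable.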

Inventing parameter combinations merely hides the fact that these things
aren't all simultaneously achievable. In light of that, I have been
espousing a simple approach to the typical case, and for the first
release. I can see that people may assume my words have various other
reasons behind them, but that's not the case. If I could give it all to
you, I would.
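
For the pg_last_xlog_receive_location() check mentioned in the quoted
text, a minimal sketch of how the results from several standbys might be
compared. It assumes the values have already been collected (for
example by running the function on each standby) in the usual
"hex/hex" text form; the helper names, standby names and locations
below are made up for illustration.

```python
from typing import Dict, Tuple


def parse_xlog_location(loc: str) -> Tuple[int, int]:
    """Split an 'X/Y' WAL location into two integers (both parts are hex)."""
    log_id, offset = loc.strip().split("/")
    return int(log_id, 16), int(offset, 16)


def most_advanced(received: Dict[str, str]) -> str:
    """Return the name of the standby that has received the most WAL.

    Locations must be compared numerically; comparing the raw strings
    gives wrong answers when the hex parts differ in length.
    """
    return max(received, key=lambda name: parse_xlog_location(received[name]))


# Hypothetical values, as if read from each standby after a master crash.
received = {
    "next_rack": "0/10000000",
    "other_building": "0/A000000",
}
print(most_advanced(received))
```

Note that a plain string comparison would rank "0/A000000" ahead of
"0/10000000" here, which is why the sketch parses the hex parts first.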

--
Simon Riggs http://www.2ndQuadrant.com/books/
PostgreSQL Development, 24x7 Support, Training and Services
