Re: Sync Rep Design

From: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Josh Berkus <josh(at)postgresql(dot)org>, Simon Riggs <simon(at)2ndQuadrant(dot)com>, greg(at)2ndQuadrant(dot)com, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Sync Rep Design
Date: 2011-01-02 09:20:39
Message-ID: 4D204367.4030308@kaltenbrunner.cc
Lists: pgsql-hackers

On 01/02/2011 09:35 AM, Heikki Linnakangas wrote:
> On 02.01.2011 00:40, Josh Berkus wrote:
>> On 1/1/11 5:59 AM, Stefan Kaltenbrunner wrote:
>>> well you keep saying that but to be honest I cannot really even see a
>>> usecase for me - what is "only a random one of a set of servers is sync
>>> at any time and I don't really know which one".
>>> My usecases would all involve 2 sync standbys and 1 or more async ones.
>>> but the second sync one would be in a different datacenter and I NEED to
>>> protect against a datacenter failure which your proposal says I cannot
>>> do :(
>>
>> As far as I know, *nobody* has written the bookkeeping code to actually
>> track which standbys have ack'd. We need to get single-ack synch
>> standby merged, tested and working before we add anything as complicated
>> as "each standby on this list must ack". That means that it's extremely
>> unlikely for 9.1 at this point.
>
> The bookkeeping will presumably consist of an XLogRecPtr in shared
> memory for each standby, tracking how far the standby has acknowledged.
> At commit, you scan the standby slots in shared memory and check that
> the required standbys have acknowledged your commit record. The
> bookkeeping required is the same whether or not we support a list of
> standbys that must ack or just one.
>
>> Frankly, if Simon hadn't already submitted code, I'd be pushing for
>> single-standby-only for 9.1, instead of "any one".
>
> Yes, we are awfully late, but let's not panic.
>
> BTW, there's a bunch of replication related stuff that we should work to
> close, that are IMHO more important than synchronous replication. Like
> making the standby follow timeline changes, to make failovers smoother,
> and the facility to stream a base-backup over the wire. I wish someone
> worked on those...

yeah, I agree that those two are much more of a problem for the general
user base. Whatever people think about our current system, it is very
easy to configure (in terms of knobs to toggle) but extremely hard to
set up and to deal with during failovers (and I know nobody who got it
right the first few times or has not fucked up one thing in the process).
Syncrep is important, but I would argue that getting those two fixed
is even more so ;)

Stefan
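
For illustration only, the bookkeeping Heikki describes above (one
acknowledged WAL position per standby, scanned at commit) could be
sketched roughly as follows. Everything here is an assumption made for
the example, not code from any submitted patch: the StandbySlot struct,
the AckPolicy enum, the commit_is_acked() function, and the modelling of
XLogRecPtr as a single 64-bit integer are all hypothetical, and the
locking a real shared-memory array would need is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: a WAL position modelled as one 64-bit integer. */
typedef uint64_t XLogRecPtr;

/* Hypothetical per-standby slot that would live in shared memory. */
typedef struct StandbySlot
{
    bool       in_use;    /* is a standby currently attached to this slot? */
    bool       required;  /* must this standby ack before a commit returns? */
    XLogRecPtr acked;     /* highest WAL position this standby has acknowledged */
} StandbySlot;

/* Hypothetical policies: "any one" standby, or every standby on the list. */
typedef enum AckPolicy
{
    ACK_ANY_ONE,
    ACK_ALL_REQUIRED
} AckPolicy;

/*
 * Scan the slots and decide whether the commit record at commit_lsn has
 * been acknowledged according to the policy. The scan is the same for
 * both policies; only the decision made per slot differs.
 */
bool
commit_is_acked(const StandbySlot *slots, size_t nslots,
                XLogRecPtr commit_lsn, AckPolicy policy)
{
    size_t required_seen = 0;

    assert(slots != NULL || nslots == 0);

    for (size_t i = 0; i < nslots; i++)
    {
        bool acked = slots[i].in_use && slots[i].acked >= commit_lsn;

        if (policy == ACK_ANY_ONE)
        {
            if (acked)
                return true;    /* one acknowledgement is enough */
            continue;
        }

        if (!slots[i].required)
            continue;           /* async standbys do not hold up the commit */

        required_seen++;
        if (!acked)
            return false;       /* a required standby is behind or detached */
    }

    /* Design choice in this sketch: an empty required list never satisfies. */
    return policy == ACK_ALL_REQUIRED && required_seen > 0;
}
```

With two required standbys at positions 100 and 80 and a commit record
at 90, this sketch reports the commit as acknowledged under ACK_ANY_ONE
but not under ACK_ALL_REQUIRED until the second standby reaches 90. The
per-standby state is identical in both cases, which is the point made in
the quoted paragraph.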
