Re: Configuring synchronous replication

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, jd(at)commandprompt(dot)com, Thom Brown <thom(at)linux(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Dave Page <dpage(at)pgadmin(dot)org>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Configuring synchronous replication
Date: 2010-09-24 12:38:15
Message-ID: AANLkTi=DhqLuG+R4zX-cRJCEyECDmZnUk1cjM5ZJV9vJ@mail.gmail.com
Lists: pgsql-committers pgsql-hackers

On Fri, Sep 24, 2010 at 6:37 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> > Earlier you argued that centralizing parameters would make this nice and
>> > simple. Now you're pointing out that we aren't centralizing this at all,
>> > and it won't be simple. We'll have to have a standby.conf set up that is
>> > customised in advance for each standby that might become a master. Plus
>> > we may even need multiple standby.confs in case that we have multiple
>> > nodes down. This is exactly what I was seeking to avoid and exactly what
>> > I meant when I asked for an analysis of the failure modes.
>>
>> If you're operating on the notion that no reconfiguration will be
>> necessary when nodes go down, then we have very different notions of
>> what is realistic.  I think that "copy the new standby.conf file in
>> place" is going to be the least of the fine admin's problems.
>
> Earlier you argued that setting parameters on each standby was difficult
> and we should centralize things on the master. Now you tell us that
> actually we do need lots of settings on each standby and that to think
> otherwise is not realistic. That's a contradiction.

You've repeatedly accused me and others of contradicting ourselves. I
don't think that's helpful in advancing the debate, and I don't think
it's what I'm doing.

The point I'm trying to make is that when failover happens, lots of
reconfiguration is going to be needed. There is just no getting
around that. Let's ignore synchronous replication entirely for a
moment. You're running 9.0 and you have 10 slaves. The master dies.
You promote a slave. Guess what? You need to look at each slave you
didn't promote and adjust primary_conninfo. You also need to check
whether that slave has received an xlog record with a higher LSN than
any the promoted slave received. If it has, you need to take a new
base backup.
Otherwise, you may have data corruption - very possibly silent data
corruption.

Do you dispute this? If so, on which point?
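
To make the LSN check concrete, here is a rough sketch in Python. It assumes you have already collected the last received xlog location from each node by hand (for example with pg_last_xlog_receive_location(), which prints a location as two hex numbers separated by a slash). The standby names and location values are made up for illustration, and the helper functions are hypothetical, not part of any PostgreSQL tool.

```python
def parse_lsn(text):
    """Convert an 'xlogid/xrecoff' hex string such as '0/3000060' to an integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def standbys_needing_rebuild(promoted_lsn, standby_lsns):
    """Return names of standbys that received WAL beyond the promoted node."""
    limit = parse_lsn(promoted_lsn)
    return sorted(
        name for name, lsn in standby_lsns.items() if parse_lsn(lsn) > limit
    )


# Hypothetical receive locations gathered from each surviving standby.
received = {"standby2": "0/3000060", "standby3": "0/30000A8", "standby4": "1/10"}

# Any standby listed here is ahead of the promoted node and needs a new base backup.
print(standbys_needing_rebuild("0/3000060", received))
```

A standby that is at or behind the promoted node's location is not listed; it still needs primary_conninfo pointed at the new master.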

The reason I think that we should centralize parameters on the master
is that they affect *the behavior of the master*. Controlling
whether the master will wait for the slave on the slave strikes me
(and others) as spooky action at a distance. Configuring whether the
master will retain WAL for a disconnected slave on the slave is
outright byzantine. Of course, configuring these parameters on the
master means that when the master changes, you're going to need a
configuration (possibly the same, possibly different) for said
parameters on the new master. But since you may be doing a lot of
other adjustment at that point anyway (e.g. new base backups, changes
in the set of synchronous slaves) that doesn't seem like a big deal.

> The chain of argument used to support this as being a sensible design
> choice is broken or contradictory in more than one place. I think we
> should be looking for a design using the KISS principle, while
> retaining sensible tuning options.

The KISS principle is exactly what I am attempting to apply.
Configuring parameters that affect the master on some machine other
than the master isn't KISS, to me. You may find that broken or
contradictory, but I disagree. I am attempting to disagree
respectfully, but statements like the above make me feel like you're
flaming, and that's getting under my skin.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
