Re: [COMMITTERS] pgsql: Efficient transaction-controlled synchronous replication.

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [COMMITTERS] pgsql: Efficient transaction-controlled synchronous replication.
Date: 2011-03-18 14:10:03
Message-ID: AANLkTinRkoYcfrkMkzFsA3trGfwSxiqLDMw+PA9dicDt@mail.gmail.com
Lists: pgsql-committers pgsql-hackers

On Mon, Mar 7, 2011 at 3:44 AM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> On Mon, Mar 7, 2011 at 5:27 PM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>> On Mon, Mar 7, 2011 at 7:51 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>>> Efficient transaction-controlled synchronous replication.
>>> If a standby is broadcasting reply messages and we have named
>>> one or more standbys in synchronous_standby_names then allow
>>> users who set synchronous_replication to wait for commit, which
>>> then provides strict data integrity guarantees. Design avoids
>>> sending and receiving transaction state information so minimises
>>> bookkeeping overheads. We synchronize with the highest priority
>>> standby that is connected and ready to synchronize. Other standbys
>>> can be defined to takeover in case of standby failure.
>>>
>>> This version has very strict behaviour; more relaxed options
>>> may be added at a later date.
>>
>> Pretty cool! I'd appreciate very much your efforts and contributions.
>>
>> And I found one bug ;) You seem to have wrongly removed the check
>> of max_wal_senders in SyncRepWaitForLSN. This can make the
>> backend wait for replication even if max_wal_senders = 0. I could produce
>> this problematic situation in my machine. The attached patch fixes this problem.
>
>        if (strlen(SyncRepStandbyNames) > 0 && max_wal_senders == 0)
>                ereport(ERROR,
>                                (errmsg("Synchronous replication requires WAL streaming (max_wal_senders > 0)")));
>
> The above check should be required also after pg_ctl reload since
> synchronous_standby_names can be changed by SIGHUP?
> Or how about just removing that? If the patch I submitted is
> committed, empty synchronous_standby_names and max_wal_senders = 0
> settings are no longer unsafe.

This configuration is now harmless in the sense that it no longer
horribly breaks the entire system, but it's still pretty useless, so
this might be deemed a valuable sanity check. However, I'm reluctant
to leave it in there, because someone could change their config to
this state, pg_ctl reload, see everything working, and then later stop
the cluster and be unable to start it back up again. Since most
people don't shut their database systems down very often, they might
not discover that they have an invalid config until much later. I
think it's probably not a good idea to have configs that are valid on
reload but prevent startup, so I'm inclined to either remove this
check altogether or downgrade it to a warning.
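
To illustrate the "downgrade it to a warning" option, here is a minimal
standalone sketch, not the actual postmaster code. The function name
check_sync_rep_settings() and its parameters are hypothetical stand-ins,
and fprintf() stands in for ereport(WARNING, ...). The point is only that
a check which warns instead of failing can be run identically at startup
and after a reload, so no config is accepted by one path and rejected by
the other.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for the check discussed above; not PostgreSQL
 * source. Returns true if the two settings are consistent. On the
 * useless combination it only emits a warning and returns false, so the
 * caller can keep going whether it is starting up or reloading.
 */
static bool
check_sync_rep_settings(const char *sync_standby_names, int max_wal_senders)
{
	if (strlen(sync_standby_names) > 0 && max_wal_senders == 0)
	{
		fprintf(stderr,
				"WARNING: synchronous_standby_names is set but "
				"max_wal_senders = 0, so no standby can connect\n");
		return false;
	}
	return true;
}
```

Under these assumptions, both the startup path and the SIGHUP handler
would call the same function and ignore a false return apart from the
logged warning.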

As a side note, it's not very obvious why some parts of PostmasterMain
report problems by doing write_stderr() and exit() while other parts
use ereport(ERROR). This check and the nearby checks on WAL level are
immediately preceded and followed by other checks that use the
opposite technique.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
