Re: Review: Patch for Synchronous Replication

From: Thom Brown <thom(at)linux(dot)com>
To: pgsql-rrreviewers(at)postgresql(dot)org
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Subject: Re: Review: Patch for Synchronous Replication
Date: 2010-09-29 23:29:03
Message-ID: AANLkTikTGQm99Wv-t7K8rNms2NUpaLBoG=Avq-c3Jc1F@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-rrreviewers

On 29 September 2010 23:57, Thom Brown <thom(at)linux(dot)com> wrote:
> On 29 September 2010 23:29, Thom Brown <thom(at)linux(dot)com> wrote:
>> This is a basic review of Fujii Masao's synchronous replication patch
>> from http://archives.postgresql.org/message-id/AANLkTik2c3kV7HgJnM4MjkCWVG-QvDJXD3iR9TqsCnpP@mail.gmail.com
>>
>> Review Description
>> ==================
>> This patch extends existing asynchronous streaming replication with
>> options to enable different levels of synchronous behaviour.  On the
>> primary, the is a new standbys.conf file containing a manifest of
>> standbys it seeks a form a confirmation from before committing
>> transactions.  These contain standby name, level of synchronicity (if
>> it wasn't a word, it is now), and a timeout value specifying how long
>> the primary is willing to wait for confirmation before aborting the
>> transaction.  On the standby, the new option "standby_name" has been
>> added to publish a named identity of the standby server to the
>> primary.
>>
>>
>> Patch application
>> =================
>> The patch applies cleanly to HEAD.  All regression tests pass
>> successfully as expected.
>>
>>
>> Testing
>> =======
>> I configured the primary's standby.conf to provide synchronous
>> replication to a single slave using fsync as it's replication level
>> and a timeout of 100ms.  The wal_level was set to hot_standby in
>> postgresql.conf.  The standby's recovery.conf had its standby_name
>> value set to match that expected by the primary's standby.conf.  This
>> is summarised as follows:
>>
>> ## primary postgresql.conf:
>> wal_level = hot_standby
>> max_wal_senders = 2
>> wal_keep_segments = 2
>>
>> ## primary standbys.conf
>> # STANDBY NAME    SYNCHRONOUS   TIMEOUT
>> cougar            fsync           100ms
>>
>> ## primary pg_hba.conf
>> # TYPE  DATABASE        USER            CIDR-ADDRESS            METHOD
>> host    replication     postgres        192.168.102.17/32       trust
>>
>> ## standby postgresql.conf
>> hot_standby = on
>>
>> ## standby recovery.conf
>> standby_name = 'cougar'
>> standby_mode = 'on'
>> primary_conninfo = 'host=192.168.102.125 port=5432'
>>
>>
>> Issues
>> ======
>>
>> The primary started up fine and was accepting connections.  A base
>> backup was taken for the standby, and after configuring the standby, I
>> attempted to bring it up, but received the following error:
>>
>> postgres(at)cougar:~/project/data$ pg_ctl start
>> server starting
>> postgres(at)cougar:~/project/data$ LOG:  database system was shut down in
>> recovery at 2010-09-29 22:52:24 BST
>> LOG:  entering standby mode
>> LOG:  redo starts at 0/1000020
>> LOG:  record with zero length at 0/10000B0
>> FATAL:  could not connect to the primary server: invalid connection
>> option "standby_name"
>>
>> I believe I am using the correct parameter as I followed the
>> additional documentation provided in the patch.
>>
>>
>> Conclusion
>> ==========
>>
>> It at least appears to me that it isn't functional in its current
>> state, or there is setup information missing from the
>> documentation.... or I've make a stupid mistake somewhere (likely).
>
> Quick back-peddle... it appears the patch was only successful on the
> standby.  Only doc changes appeared to make it to the primary.
> Re-attempt tomorrow.  Apologies.

Well, that doesn't seem to have made any difference. Confirmed the
patch was applied in both cases, rebuilt, base backup again etc... no
change. Same error as in original review. *shrug*

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Euler Taveira de Oliveira 2010-09-30 00:17:59 external pid file
Previous Message Tatsuo Ishii 2010-09-29 23:09:08 Re: Unable to generate man pages for translated sgml

Browse pgsql-rrreviewers by date

  From Date Subject
Next Message Robert Haas 2010-09-30 00:38:59 Re: Review: Patch for Synchronous Replication
Previous Message Thom Brown 2010-09-29 22:57:33 Re: Review: Patch for Synchronous Replication