Re: Synchronous replication patch built on SR

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: zb(at)cybertec(dot)at
Cc: pgsql-hackers(at)postgresql(dot)org, hs(at)cybertec(dot)at
Subject: Re: Synchronous replication patch built on SR
Date: 2010-04-30 20:57:32
Message-ID: 201004302057.o3UKvWj26902@momjian.us
Lists: pgsql-hackers


Please add it to the next commit-fest:

https://commitfest.postgresql.org/action/commitfest_view/inprogress

---------------------------------------------------------------------------

zb(at)cybertec(dot)at wrote:
> Resending, my ISP lost my mail yesterday. :-(
>
> ===========================================================
>
> Hi,
>
> attached is a patch that does $SUBJECT, we are submitting it for 9.1.
> I have updated it to today's CVS after the "wal_level" GUC went in.
>
> How does it work?
>
> First, the walreceiver and the walsender are now able to communicate
> in a duplex way on the same connection, so while COPY OUT is
> in progress from the primary server, the standby server is able to
> issue PQputCopyData() to pass the transaction IDs that were seen
> with XLOG_XACT_COMMIT or XLOG_XACT_PREPARE
> signatures. I did this by adding a new protocol message type, with letter
> 'x', that is only acknowledged by the walsender process. The regular
> backend was intentionally left unchanged, so an SQL client gets a protocol
> error. A new libpq call, PQsetDuplexCopy(), sends this
> new message before sending START_REPLICATION. The primary
> makes a note of it in the walsender process' entry.
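
The framing of the new 'x' message described above might look roughly like
the sketch below. It assumes the message follows the usual frontend layout
(one type byte, then a 4-byte big-endian length that counts itself) and
carries no payload; the patch text does not state the payload, so that part
is an assumption, and build_duplex_request() is a hypothetical helper, not
a libpq function.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: write a payload-less 'x' message into buf.
 * Layout assumed here: 1 type byte + int32 length in network byte order,
 * where the length counts its own 4 bytes but not the type byte.
 * Returns the number of bytes written, or 0 if buf is too small.
 */
size_t build_duplex_request(uint8_t *buf, size_t buflen)
{
    const uint32_t len = 4;     /* assumption: empty message body */

    if (buf == NULL || buflen < 5)
        return 0;

    buf[0] = (uint8_t) 'x';                 /* message type letter */
    buf[1] = (uint8_t) ((len >> 24) & 0xFF);
    buf[2] = (uint8_t) ((len >> 16) & 0xFF);
    buf[3] = (uint8_t) ((len >> 8) & 0xFF);
    buf[4] = (uint8_t) (len & 0xFF);

    return 5;
}
```

A walreceiver-like client would send these bytes before START_REPLICATION;
a regular backend that does not know the 'x' type would answer with a
protocol error, as described above.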
>
> I had to move the TransactionIdLatest(xid, nchildren, children) call
> that computes latestXid earlier in RecordTransactionCommit(), so
> it's in the critical section now, just before the
> XLogInsert(RM_XACT_ID, XLOG_XACT_COMMIT, rdata)
> call. Otherwise, there was a race condition between the primary
> and the standby server, where the standby server might have seen
> the XLOG_XACT_COMMIT record for some XIDs before the
> transaction in the primary server marked itself waiting for this XID,
> resulting in stuck transactions.
>
> I have added 3 new options, two GUCs in postgresql.conf and one
> setting in recovery.conf. These options are:
>
> 1. min_sync_replication_clients = N
>
> where N is the number of reports for a given transaction before it's
> released as committed synchronously. 0 means completely asynchronous;
> the value is capped at max_wal_senders. Anything
> in between 0 and max_wal_senders means different levels of partially
> synchronous replication.
>
> 2. strict_sync_replication = boolean
>
> where the expected number of synchronous reports from standby
> servers is further limited to the actual number of connected synchronous
> standby servers if the value of this GUC is false. This means that if
> no standby servers are connected yet then the replication is asynchronous
> and transactions are allowed to finish without waiting for synchronous
> reports. If the value of this GUC is true, then transactions wait until
> enough synchronous standbys connect and report back.
>
> 3. synchronous_slave = boolean (in recovery.conf)
>
> This instructs the standby server to tell the primary that it is a
> synchronous replication server and that it will send the committed XIDs
> back to the primary.
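
The interaction of the first two options can be sketched as follows. This
is only an illustration of the rules as described in the message, not code
from the patch: required_sync_reports() and commit_may_return() are
hypothetical names, and clamping negative values to 0 is an assumption.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: how many standby reports a commit must wait for.
 *  - min_sync_replication_clients is capped at max_wal_senders;
 *  - with strict_sync_replication = false it is further limited to the
 *    number of currently connected synchronous standbys;
 *  - with strict_sync_replication = true the configured value stands,
 *    so commits wait until enough standbys connect and report.
 */
int required_sync_reports(int min_sync_replication_clients,
                          int max_wal_senders,
                          bool strict_sync_replication,
                          int connected_sync_standbys)
{
    int required = min_sync_replication_clients;

    if (required < 0)
        required = 0;           /* assumption: negative values mean async */
    if (required > max_wal_senders)
        required = max_wal_senders;
    if (!strict_sync_replication && required > connected_sync_standbys)
        required = connected_sync_standbys;

    return required;
}

/* A waiting transaction is released once enough reports have arrived. */
bool commit_may_return(int reports_received, int required)
{
    return reports_received >= required;
}
```

For example, with min_sync_replication_clients = 2, max_wal_senders = 5 and
no standby connected, the non-strict setting yields 0 required reports
(asynchronous behaviour), while the strict setting yields 2 and the commit
keeps waiting.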
>
> I also added a contrib module for monitoring the synchronous replication
> but it abuses the procarray.c code by exposing the procArray pointer
> which is ugly. It either needs to be abandoned or moved to core if or when
> this code is discussed enough. :-)
>
> Best regards,
> Zoltán Böszörményi

[ Attachment, skipping... ]

>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com
