Skip site navigation (1) Skip section navigation (2)

Re: Replication documentation addition

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: PostgreSQL-documentation <pgsql-docs(at)postgresql(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Replication documentation addition
Date: 2006-10-24 03:55:31
Message-ID: 200610240355.k9O3tVF20288@momjian.us (view raw or flat)
Thread:
Lists: pgsql-docspgsql-hackers
Please disregard.  I am redoing it and will post a URL with the most
recent version.

---------------------------------------------------------------------------

Bruce Momjian wrote:
> 
> Here is my first draft of a new replication section for our
> documentation.  I am looking for any comments.
> 
> ---------------------------------------------------------------------------
> 
> Replication
> ===========
> 
> Database replication allows multiple computers to work together, making
> them appear as a single computer to user applications.  This might
> involve allowing a backup server to take over if the primary server
> fails, or it might involve allowing several computers to work together
> at the same time.
> 
> It would be ideal if database servers could be combined seamlessly.  Web
> servers serving static web pages can be combined quite easily by merely
> load-balancing web requests to multiple machines.  In fact, most
> read-only servers can be combined relatively easily.
> 
> Unfortunately, most database servers have a read/write mix of requests,
> and read/write servers are much harder to combine.  This is because
> though read-only data has to be placed on each each server only once, a
> write to any server has to be seen by all other servers so that future
> read requests to those servers return consistent results.  
> 
> This "sync problem" is the fundamental difficulty of doing database
> replication.  Because there is no single solution that limits the impact
> of the sync problem for all workloads, there are multiple replication
> solutions.  Each solution addresses the sync problem in a different way,
> and minimizes its impact for a specific workload.  
> 
> This section first outlines two important replication capabilities, and
> then outlines various replication solutions.
> 
> Sychronous vs. Asynchronous Replication
> ---------------------------------------
> 
> The term sychronous replication means that a query is not considered
> committed unless all servers have access to the committed records.  In
> that case, a failover to a backup server will lose no data records. 
> Asynchronous replication has a small delay between the time of commit
> and its propogation to backup servers, opening the possibility that some
> transactions might be lost in a switch to a backup server.  Asynchronous
> is used when sychronous replication would be too slow.
> 
> Full vs. Partial Replication
> ----------------------------
> 
> The term full replication means only a full database cluster can be
> replicated, while partial replication means more fine-grained control
> over replicated objects is possible.
> 
> Shared Disk Failover
> -------------------- 
> 
> This replication solution avoids the sync problem by having only one
> copy of the database.  This is possible because a single disk array is
> shared by multiple servers.  If the main database server fails, the
> backup server is able to mount and start the database as though it was
> restarting after a database crash.  This shared hardware functionality
> is common in network storage devices.  This allows sychronous, full
> replication.
> 
> Warm Standby Using Point-In-Time Recovery
> -----------------------------------------
> 
> A warm standby server (add doc xref) can be kept current by reading a
> stream of WAL records.  If the main server fails, the warm standby
> contains almost all of the data as the main server, and can be used as
> the new database server.  This allows asychronous, full replication.
> 
> Point-In-Time Recovery  [Asychronous, Full]
> ----------------------
> 
> A Point-In-Time Recovery is the same as a Warm Standby server except
> that the standby server must go though a full restore and archive
> recovery operation, delaying how quickly it can be used as the main
> database server.  This allows asychronous, full replication.
> 
> Continuously Running Failover Server
> ------------------------------------
> 
> A continuously running failover server allows the backup server to
> answer read-only queries while the master server is running.  It
> receives a continuous stream of write activity from the master server. 
> Because the failover server can be used for read-only database requests,
> it is ideal for data warehouse queries. Slony offers this as
> asychronous, partial replication.
> 
> Data Partitioning
> -----------------
> 
> Data partitioning partitions the database into data sets.  To achieve
> replication, each data set can only be modified by one server.  For
> example, data can be partitioned by main office, e.g. London and Paris. 
> While London and Paris servers have all data records, only London can
> modify London records, and Paris can only modify Paris records.  Such
> partitioning is usually accomplished in application code, though rules
> and triggers can help enforce such partitioning and keep the read-only
> data sets current.  Slony can also be used in such a setup.  While Slony
> replicates only entire tables, London and Paris can be placed in
> separate tables, and inheritance can be used to pull from both tables at
> the same time.
> 
> Query Broadcast Replication
> ---------------------------
> 
> This involves sending write queries to multiple servers.  Read-only
> queries can be sent to a single server because there is no need for all
> servers to process it.   This can be complex to setup because functions
> like random() and CURRENT_TIMESTAMP will have different values on
> different servers, and sequences should be consistent across servers.
> Pgpool implements this type of replication.
> 
> Multi-Master Replication
> ------------------------
> 
> In multi-master replication, each server can accept write requests, and
> these write requests are broadcast to all other servers before the
> transaction commits.  Under heavy load, this type of replication can
> cause excessive locking and performance degradation.  It is implemented
> by Oracle in their RAC product.  PostgreSQL does not offer this type of
> replication, though PostgreSQL two-phase commit can be used to implement
> this in application code.
> 
> Performance
> -----------
> Performance must be considered in any repliacation choice.  There is
> usually a tradeoff between functionality and performance.  For example,
> full sychronousreplication over a slow network might cut performance by
> more than half, while asynchronous replication might have a minimal
> performance imact.
> 
> -- 
>   Bruce Momjian   bruce(at)momjian(dot)us
>   EnterpriseDB    http://www.enterprisedb.com
> 
>   + If your life is a hard drive, Christ can be your backup. +
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings

-- 
  Bruce Momjian   bruce(at)momjian(dot)us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

In response to

pgsql-docs by date

Next:From: Bruce MomjianDate: 2006-10-24 04:20:40
Subject: Replication documentation addition
Previous:From: Bruce MomjianDate: 2006-10-24 03:39:34
Subject: Replication documentation addition

pgsql-hackers by date

Next:From: Bruce MomjianDate: 2006-10-24 04:20:40
Subject: Replication documentation addition
Previous:From: Bruce MomjianDate: 2006-10-24 03:39:34
Subject: Replication documentation addition

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group