Re: Replication documentation addition

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: PostgreSQL-documentation <pgsql-docs(at)postgresql(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Replication documentation addition
Date: 2006-10-24 03:55:31
Message-ID: 200610240355.k9O3tVF20288@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-docs pgsql-hackers


Please disregard. I am redoing it and will post a URL with the most
recent version.

---------------------------------------------------------------------------

Bruce Momjian wrote:
>
> Here is my first draft of a new replication section for our
> documentation. I am looking for any comments.
>
> ---------------------------------------------------------------------------
>
> Replication
> ===========
>
> Database replication allows multiple computers to work together, making
> them appear as a single computer to user applications. This might
> involve allowing a backup server to take over if the primary server
> fails, or it might involve allowing several computers to work together
> at the same time.
>
> It would be ideal if database servers could be combined seamlessly. Web
> servers serving static web pages can be combined quite easily by merely
> load-balancing web requests to multiple machines. In fact, most
> read-only servers can be combined relatively easily.
>
> Unfortunately, most database servers have a read/write mix of requests,
> and read/write servers are much harder to combine. This is because
> though read-only data has to be placed on each each server only once, a
> write to any server has to be seen by all other servers so that future
> read requests to those servers return consistent results.
>
> This "sync problem" is the fundamental difficulty of doing database
> replication. Because there is no single solution that limits the impact
> of the sync problem for all workloads, there are multiple replication
> solutions. Each solution addresses the sync problem in a different way,
> and minimizes its impact for a specific workload.
>
> This section first outlines two important replication capabilities, and
> then outlines various replication solutions.
>
> Sychronous vs. Asynchronous Replication
> ---------------------------------------
>
> The term sychronous replication means that a query is not considered
> committed unless all servers have access to the committed records. In
> that case, a failover to a backup server will lose no data records.
> Asynchronous replication has a small delay between the time of commit
> and its propogation to backup servers, opening the possibility that some
> transactions might be lost in a switch to a backup server. Asynchronous
> is used when sychronous replication would be too slow.
>
> Full vs. Partial Replication
> ----------------------------
>
> The term full replication means only a full database cluster can be
> replicated, while partial replication means more fine-grained control
> over replicated objects is possible.
>
> Shared Disk Failover
> --------------------
>
> This replication solution avoids the sync problem by having only one
> copy of the database. This is possible because a single disk array is
> shared by multiple servers. If the main database server fails, the
> backup server is able to mount and start the database as though it was
> restarting after a database crash. This shared hardware functionality
> is common in network storage devices. This allows sychronous, full
> replication.
>
> Warm Standby Using Point-In-Time Recovery
> -----------------------------------------
>
> A warm standby server (add doc xref) can be kept current by reading a
> stream of WAL records. If the main server fails, the warm standby
> contains almost all of the data as the main server, and can be used as
> the new database server. This allows asychronous, full replication.
>
> Point-In-Time Recovery [Asychronous, Full]
> ----------------------
>
> A Point-In-Time Recovery is the same as a Warm Standby server except
> that the standby server must go though a full restore and archive
> recovery operation, delaying how quickly it can be used as the main
> database server. This allows asychronous, full replication.
>
> Continuously Running Failover Server
> ------------------------------------
>
> A continuously running failover server allows the backup server to
> answer read-only queries while the master server is running. It
> receives a continuous stream of write activity from the master server.
> Because the failover server can be used for read-only database requests,
> it is ideal for data warehouse queries. Slony offers this as
> asychronous, partial replication.
>
> Data Partitioning
> -----------------
>
> Data partitioning partitions the database into data sets. To achieve
> replication, each data set can only be modified by one server. For
> example, data can be partitioned by main office, e.g. London and Paris.
> While London and Paris servers have all data records, only London can
> modify London records, and Paris can only modify Paris records. Such
> partitioning is usually accomplished in application code, though rules
> and triggers can help enforce such partitioning and keep the read-only
> data sets current. Slony can also be used in such a setup. While Slony
> replicates only entire tables, London and Paris can be placed in
> separate tables, and inheritance can be used to pull from both tables at
> the same time.
>
> Query Broadcast Replication
> ---------------------------
>
> This involves sending write queries to multiple servers. Read-only
> queries can be sent to a single server because there is no need for all
> servers to process it. This can be complex to setup because functions
> like random() and CURRENT_TIMESTAMP will have different values on
> different servers, and sequences should be consistent across servers.
> Pgpool implements this type of replication.
>
> Multi-Master Replication
> ------------------------
>
> In multi-master replication, each server can accept write requests, and
> these write requests are broadcast to all other servers before the
> transaction commits. Under heavy load, this type of replication can
> cause excessive locking and performance degradation. It is implemented
> by Oracle in their RAC product. PostgreSQL does not offer this type of
> replication, though PostgreSQL two-phase commit can be used to implement
> this in application code.
>
> Performance
> -----------
> Performance must be considered in any repliacation choice. There is
> usually a tradeoff between functionality and performance. For example,
> full sychronousreplication over a slow network might cut performance by
> more than half, while asynchronous replication might have a minimal
> performance imact.
>
> --
> Bruce Momjian bruce(at)momjian(dot)us
> EnterpriseDB http://www.enterprisedb.com
>
> + If your life is a hard drive, Christ can be your backup. +
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

In response to

Browse pgsql-docs by date

  From Date Subject
Next Message Bruce Momjian 2006-10-24 04:20:40 Replication documentation addition
Previous Message Bruce Momjian 2006-10-24 03:39:34 Replication documentation addition

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2006-10-24 04:20:40 Replication documentation addition
Previous Message Bruce Momjian 2006-10-24 03:39:34 Replication documentation addition