Re: Understanding PG9.0 streaming replication feature

From: Dan Birken <dan(at)thumbtack(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Ben Carbery <ben(dot)carbery(at)gmail(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: Understanding PG9.0 streaming replication feature
Date: 2011-01-26 22:22:41
Message-ID: AANLkTine9HZ_y9x5W=RfnsUx6ZrQWzOBnj8R_ss7URvL@mail.gmail.com
Lists: pgsql-general

(I am not the OP, but recently went through the same thing so I'll chime in)

Reading through the documentation now (albeit with a pretty good
understanding of how everything works), I think the main source of confusion
is that material applying to file-based log shipping, to streaming
replication, and to both of them is thrown together on this page
<http://developer.postgresql.org/pgdocs/postgres/warm-standby.html>,
making it difficult to figure out what you need to know if you are just
looking to implement streaming replication.

For example, in the introduction section:

Directly moving WAL records from one database server to another is typically
described as log shipping. PostgreSQL implements file-based log shipping,
which means that WAL records are transferred one file (WAL segment) at a
time. WAL files (16MB) can be shipped easily and cheaply over any distance,
whether it be to an adjacent system, another system at the same site, or
another system on the far side of the globe. The bandwidth required for this
technique varies according to the transaction rate of the primary server.
Record-based log shipping is also possible with streaming replication (see
Section 25.2.5
<http://developer.postgresql.org/pgdocs/postgres/warm-standby.html#STREAMING-REPLICATION>).

It should be noted that the log shipping is asynchronous, i.e., the WAL
records are shipped after transaction commit. As a result, there is a window
for data loss should the primary server suffer a catastrophic failure;
transactions not yet shipped will be lost. The size of the data loss window
in file-based log shipping can be limited by use of the archive_timeout
parameter, which can be set as low as a few seconds. However such a low
setting will substantially increase the bandwidth required for file shipping.
If you need a window of less than a minute or so, consider using streaming
replication (see Section 25.2.5
<http://developer.postgresql.org/pgdocs/postgres/warm-standby.html#STREAMING-REPLICATION>).

I colored things that apply to both in purple, things that apply just to
file-based log shipping in red, and things that apply just to streaming
replication in green. So if you are reading through this for the first time
looking for information on streaming replication, it is very difficult to
pick out some key points (it works by log shipping, it is asynchronous) while
avoiding the stuff that you don't need to worry about (archive_timeout, WAL
files being transferred one at a time, etc.).
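
To put a rough number on the archive_timeout bandwidth remark in the excerpt,
here is a back-of-the-envelope sketch. It assumes the default 16MB segment
size mentioned above, a primary busy enough that a segment switch is forced
at every timeout, and no compression in the archive command; a forced switch
still ships a full-size file. The function names are made up for this
illustration.

```python
# Worst-case bandwidth for file-based log shipping when archive_timeout
# forces a WAL segment switch every `timeout_s` seconds.
# Assumptions: default 16MB segments, continuous activity, no compression.
WAL_SEGMENT_BYTES = 16 * 1024 * 1024


def worst_case_bytes_per_second(timeout_s: float) -> float:
    """One whole segment is shipped per timeout, however little of it is used."""
    return WAL_SEGMENT_BYTES / timeout_s


def worst_case_gib_per_day(timeout_s: float) -> float:
    """Same figure expressed as GiB shipped per day."""
    return worst_case_bytes_per_second(timeout_s) * 86400 / 1024**3


for timeout_s in (5, 60, 300):
    print(f"archive_timeout={timeout_s}s: "
          f"{worst_case_gib_per_day(timeout_s):.1f} GiB/day")
```

Under those assumptions the cost scales inversely with the timeout, which is
why a setting of a few seconds gets expensive quickly and why that paragraph
points readers who need a small window toward streaming replication instead.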

I doubt I am the first person that is using postgres replication for the
first time because of hot standbys and streaming replication, and I think
the document is very poor for dealing with those people. Just looking at
the coloring above, it looks very clearly like the document was written for
file-based log shipping and then details about streaming replication are
just appended at the end.

The great thing about the wiki page
<http://wiki.postgresql.org/wiki/Streaming_Replication> (which I am assuming
is the doc OP is referring to positively) is that it only includes details
about streaming replication, thus you don't have to constantly be dodging
information that doesn't apply to you.
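
For reference, a rough sketch of the settings a streaming-only setup on 9.0
comes down to. This is not copied from the wiki page; the numeric values, the
host name, the user name and the address are all placeholders, and the
standby still has to be seeded from a base backup of the primary first.

```
# postgresql.conf on the primary
wal_level = hot_standby     # 'archive' is enough if the standby won't serve queries
max_wal_senders = 3         # placeholder value; must be above 0 for streaming
wal_keep_segments = 32      # placeholder value; WAL kept for a lagging standby

# pg_hba.conf on the primary (user and address are placeholders)
host  replication  repuser  192.0.2.10/32  md5

# postgresql.conf on the standby
hot_standby = on            # allow read-only queries during recovery

# recovery.conf on the standby (host and user are placeholders)
standby_mode = 'on'
primary_conninfo = 'host=primary.example.com port=5432 user=repuser'
```

Note that archive_command, archive_timeout and restore_command do not appear
at all; they belong to the file-based path.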
-Dan

On Wed, Jan 26, 2011 at 7:04 AM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:

> Ben Carbery wrote:
> > Thanks for the responses all, I have this working now. I had to create a
> > base backup before copying to the standby for replication to start, but the
> > main sticking point was actually understanding the terms and concepts
> > involved.
> >
> > I think the Binary Replication Tutorial page on the wiki basically explains
> > everything. Unfortunately the actual pg manual is still about as clear as
> > mud even though I now have a vague idea of how this all works. I think this
> > is worth mentioning given the majority of the pg manual is actually of an
> > unusually high standard - probably among the best technical manuals I have
> > read in terms of being both comprehensive and concise, so it's a shame that
> > this section doesn't meet that standard (IMO). Hopefully this will get a
> > rewrite at some point!
>
> Can you give some concrete suggestions on what needs to be added? The
> current documentation is here:
>
> http://developer.postgresql.org/pgdocs/postgres/index.html
>
> --
> Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
> EnterpriseDB http://enterprisedb.com
>
> + It's impossible for everything to be true. +
