Re: Streaming Replication: Observations, Questions and Comments

From:	Samba <saasira(at)gmail(dot)com>
To:	Alan Hodgson <ahodgson(at)simkin(dot)ca>
Cc:	pgsql-general(at)postgresql(dot)org
Subject:	Re: Streaming Replication: Observations, Questions and Comments
Date:	2011-08-25 13:29:14
Message-ID:	CAKgWO9LMrmoDc=uPNOhJhZUmS7=PMeQzcJ58YOWSUSgz5PUxEw@mail.gmail.com
Lists:	pgsql-general

The problem with maintaining a separate archive is that one needs to write
additional scripts to periodically remove older log files from the
archive, and that gets complicated in a setup with one master and
multiple slaves.
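
To illustrate why the multi-slave case is fiddly, here is a minimal sketch of the decision such a cleanup script has to make. It assumes the script somehow learns the oldest segment each slave still needs (how it learns that is left out), that all segments are on the same timeline so that file names sort in WAL order, and that the function and slave names are hypothetical.

```python
import re

# WAL segment file names are 24 hexadecimal characters; anything else in the
# archive directory (backup history files, compressed copies) is left alone.
WAL_NAME = re.compile(r"^[0-9A-F]{24}$")


def removable_segments(archive_files, oldest_needed_by_slave):
    """Return the archived segments that no slave still needs.

    archive_files: file names found in the archive directory.
    oldest_needed_by_slave: dict mapping a (hypothetical) slave name to the
        oldest segment file name that slave still requires.
    Assumption: a single timeline, so plain string comparison orders segments.
    """
    if not oldest_needed_by_slave:
        return []  # no information about the slaves: keep everything
    cutoff = min(oldest_needed_by_slave.values())  # the slowest slave decides
    return sorted(
        name for name in archive_files
        if WAL_NAME.match(name) and name < cutoff
    )
```

The slowest slave sets the cutoff, so a single stalled slave makes the archive grow without bound unless the script also decides when to give up on it.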

I think it would be better to build compression and cleanup into the core
itself, maybe in a later release. A better approach to cleanup is for the
walsender process to decide when to clean up a particular log file based on
feedback from all the registered slaves. If a slave is unreachable or
falls behind for too long, that slave should be banned from the setup
(log the event in pg_replication.log ???). The replication status of each
slave could be maintained in something like a pg_slave_replica_status catalog
table.
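
The bookkeeping proposed above could look roughly like the following sketch. It is written in Python only to show the logic, not as a patch. Every name in it (SlaveStatus, max_silence, the integer WAL positions) is hypothetical, and none of it exists in PostgreSQL.

```python
from dataclasses import dataclass


@dataclass
class SlaveStatus:
    """One row of the proposed (hypothetical) replication status table."""
    name: str
    confirmed_position: int  # last WAL position the slave acknowledged
    last_feedback: float     # time of the last feedback, in seconds
    banned: bool = False


def update_and_get_cleanup_point(slaves, now, max_silence):
    """Ban silent slaves and compute how far cleanup may proceed.

    Returns (cleanup_point, events). cleanup_point is the WAL position that
    every non-banned slave has confirmed, or None if no active slave is left.
    events holds the messages that would go to the replication log.
    """
    events = []
    for slave in slaves:
        silence = now - slave.last_feedback
        if not slave.banned and silence > max_silence:
            slave.banned = True
            events.append(
                f"slave {slave.name} banned: no feedback for {silence:.0f}s"
            )
    active = [s.confirmed_position for s in slaves if not s.banned]
    cleanup_point = min(active) if active else None
    return cleanup_point, events
```

Log files entirely below the returned position would be safe to remove. A banned slave no longer holds cleanup back, at the cost of having to be set up again before it can rejoin.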

When it comes to compression, walsender could compress each chunk of data
that it streams (increasing the streaming_delay may improve the compression
ratio, so a balance has to be struck between compression and
sustainable-data-loss-in-case-of-failure).
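
The trade-off can be seen with a small experiment. This is only a sketch. The sample data is made up and far more repetitive than real transaction logs, and the chunk size stands in for the amount of data that accumulates during the delay. It compresses the same stream once in small, independently compressed chunks and once in large ones.

```python
import zlib


def compressed_size(data: bytes, chunk_size: int) -> int:
    """Total size after compressing each chunk_size slice independently."""
    total = 0
    for start in range(0, len(data), chunk_size):
        total += len(zlib.compress(data[start:start + chunk_size]))
    return total


def sample_stream() -> bytes:
    # Made-up, repetitive stand-in for a stream of log records
    return b"".join(
        f"INSERT row {i % 50} value {i * 7 % 13}\n".encode()
        for i in range(20000)
    )


if __name__ == "__main__":
    data = sample_stream()
    for chunk_size in (256, 4096, 65536):
        print(chunk_size, compressed_size(data, chunk_size))
```

Larger chunks give the compressor more context and less per-chunk overhead, but more data sits unsent on the master while a chunk fills up.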

Although I can see that this design would be much better than leaving it
to external utilities, I'm not that good at C and hence am only
proposing a design and not a patch. I hope my suggestion will be received in
good spirit.

Thanks and Regards,
Samba

PS:
I wrongly stated that the master server had to be restarted in case of long
disconnects; sorry, that was not true. But I still feel that requiring a
restart of the standby server to resume replication should be avoided, if
possible.

And I strongly feel that a break in replication must be logged by both
the master server and the affected slave servers.

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
On Wed, Aug 24, 2011 at 11:03 PM, Alan Hodgson <ahodgson(at)simkin(dot)ca> wrote:

> On August 24, 2011 08:33:17 AM Samba wrote:
> > One strange thing I noticed is that the pg_xlogs on the master have
> > outsized the actual data stored in the database by at least 3-4 times,
> > which was quite surprising. I'm not sure if 'restore_command' has anything
> > to do with it. I did not understand why transaction logs would need to be
> > so many times larger than the actual size of the database, have I done
> > something wrong somewhere?
>
> If you archive them instead of keeping them in pg_xlog, you can gzip them.
> They compress reasonably well.

In response to

Re: Streaming Replication: Observations, Questions and Comments at 2011-08-24 17:33:59 from Alan Hodgson
