Re: Streaming replication status

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Streaming replication status
Date: 2010-01-10 11:13:35
Message-ID: 1263122015.19367.139042.camel@ebony
Lists: pgsql-hackers

On Fri, 2010-01-08 at 23:16 +0200, Heikki Linnakangas wrote:

> * I removed the feature that archiver was started during recovery. The
> idea of that was to enable archiving from a standby server, to relieve
> the master server of that duty, but I found it annoying because it
> causes trouble if the standby and master are configured to archive to
> the same location; they will fight over which copies the file to the
> archive first. Frankly the feature doesn't seem very useful as the patch
> stands, because you still have to configure archiving in the master in
> practice; you can't take an online base backup otherwise, and you have
> the risk of standby falling too much behind and having to restore from
> base backup whenever the standby is disconnected for any reason. Let's
> revisit this later when it's truly useful.

Agreed

> * We still have a related issue, though: if standby is configured to
> archive to the same location as master (as it always is on my laptop,
> where I use the postgresql.conf of the master unmodified in the server),
> right after failover the standby server will try to archive all the old
> WAL files that were streamed from the master; but they exist already in
> the archive, as the master archived them already. I'm not sure if this
> is a pilot error, or if we should do something in the server to tell
> apart WAL segments streamed from master and those generated in the
> standby server after failover. Maybe we should immediately create a
> .done file for every file received from master?

That sounds like the right thing to do.
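
A minimal sketch of that idea, written in Python rather than the server's C. It assumes the pg_xlog/archive_status layout, where the archiver treats a segment that has a .done marker as already archived. The helper name mark_streamed_segment_done and the point at which the standby would call it are hypothetical and not part of the patch:

```python
import os

def mark_streamed_segment_done(xlog_dir, segment_name):
    """Create archive_status/<segment>.done so the archiver skips the segment.

    Hypothetical helper: a standby would call this for each WAL segment it
    receives from the master, so that after failover it does not try to
    archive files the master has already archived.
    """
    status_dir = os.path.join(xlog_dir, "archive_status")
    os.makedirs(status_dir, exist_ok=True)

    done_path = os.path.join(status_dir, segment_name + ".done")
    ready_path = os.path.join(status_dir, segment_name + ".ready")

    if os.path.exists(ready_path):
        # A .ready marker means "waiting to be archived"; turn it into .done.
        os.replace(ready_path, done_path)
    else:
        # Otherwise create an empty .done marker file.
        with open(done_path, "a"):
            pass
    return done_path
```

Segments generated by the standby itself after failover would not go through this path, so they would still be archived as usual.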

> * I don't think we should require superuser rights for replication.
> Although you see all WAL and potentially all data in the system through
> that, a standby doesn't need any write access to the master, so it would
> be good practice to create a dedicated account with limited privileges
> for replication.

Agreed. I think we should have a predefined user called "replication"
that has only the rights needed for replication.

> * A standby that connects to master, initiates streaming, and then sits
> idle stalls recycling of old WAL files in the master. That will
> eventually lead to a full disk in master. Do we need some kind of an
> emergency valve on that?

Can you explain how this could occur? My understanding was that the
walreceiver and startup processes were capable of independent action
specifically to avoid this kind of effect.
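
For illustration only, the kind of emergency valve asked about above could be as simple as a cap on how many WAL segments the master keeps back for a standby. This is a sketch under assumptions: the function, its parameters, and the idea of a configurable limit are all hypothetical, and nothing like it exists in the patch as posted:

```python
def should_drop_standby(current_segno, standby_oldest_needed_segno,
                        max_retained_segments):
    """Return True when the WAL held back for a standby exceeds the limit.

    All three parameters are hypothetical: segment numbers are plain
    integers, and max_retained_segments is an assumed setting.
    """
    # Segments from the oldest one the standby still needs up to the
    # master's current segment, inclusive.
    retained = current_segno - standby_oldest_needed_segno + 1
    return retained > max_retained_segments
```

With a limit of 64 segments, a standby that still needs segment 10 while the master is writing segment 100 would be over the limit, and the master could then disconnect it and recycle the old files.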

> * Documentation. The patch used to move around some sections, but I
> think that has been partially reverted so that it now just duplicates
> them. It probably needs other work too, I haven't looked at the docs in
> any detail.

I believe the docs need urgent attention. We need more people to read
the docs and understand the implications so that people can then
comment. It is extremely non-obvious from the patch how things work at a
behaviour level.

I am very concerned that there is no thought given to monitoring
replication. This will make the feature difficult to use in practice.

--
Simon Riggs www.2ndQuadrant.com
