Re: Streaming replication and a disk full in primary

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Streaming replication and a disk full in primary
Date: 2010-01-21 14:10:39
Message-ID: 4B58605F.8090908@enterprisedb.com
Lists: pgsql-hackers

Fujii Masao wrote:
> If the primary has a connected standby, the WAL files required for
> the standby cannot be deleted. So if the standby has fallen too far
> behind for some reason, a disk full failure might occur on the primary.
> This is one of the problems that should be fixed for v9.0.
>
> We can cope with that case by carefully monitoring the standby lag.
> In addition to this, I think that we should put an upper limit on
> the number of WAL files held in pg_xlog for the standby (i.e.,
> the maximum delay of the standby) as a safeguard against a disk
> full error.
>
> The attached patch introduces a new GUC 'replication_lag_segments'
> which specifies the maximum number of WAL files held in pg_xlog
> to send to the standby. Replication to a standby which falls more
> than the upper limit behind is automatically terminated, which
> would avoid a disk full error on the primary.

Thanks!

I don't think we should do the check in XLogWrite(). There's really no
reason to kill the standby connections before the next checkpoint, when
the old WAL files are recycled. XLogWrite() is in the critical path of
normal operations, too.

There's another important reason for that: If archiving is not working
for some reason, the standby can't obtain the old segments from the
archive either. If we refuse to stream such old segments, and they're
not getting archived, the standby has no way to catch up until archiving
is fixed. Allowing streaming of such old segments is free wrt. disk
space, because we're keeping the files around anyway.

Walsender will get an error if it tries to open a segment that's been
deleted or recycled already. The dangerous situation we need to avoid is
when walsender holds a file open while bgwriter recycles it.
Walsender will merrily continue streaming data from it, even though
it's been overwritten by new data already.

A straightforward fix is to keep a "newest recycled XLogRecPtr" in
shared memory that RemoveOldXlogFiles() updates. Walsender checks it
right after read()ing from a file, before sending it to the client, and
throws an error if the data it read() was already recycled.
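
To make the ordering concrete, here is a minimal sketch of that check.
It is not code from the patch: WalPos, newest_recycled and both function
names are made up for illustration, WalPos flattens XLogRecPtr into a
single 64-bit byte position, and a real version would need a spinlock
or similar around the shared variable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical simplification of XLogRecPtr: one flat WAL byte position. */
typedef uint64_t WalPos;

/*
 * Hypothetical shared-memory slot. All WAL before this position may have
 * been recycled. Real shared memory would need locking around it.
 */
static WalPos newest_recycled = 0;

/*
 * Recycling side (the role RemoveOldXlogFiles() plays in the mail).
 * Must be called BEFORE the old files are actually renamed or reused.
 */
void
advance_newest_recycled(WalPos upto)
{
    if (upto > newest_recycled)
        newest_recycled = upto;     /* never moves backwards */
}

/*
 * Sending side: call AFTER read() returns and BEFORE the bytes go to the
 * client. Returns false if the range starting at 'start' may have been
 * overwritten while it was being read.
 */
bool
read_still_valid(WalPos start, size_t len)
{
    assert(len > 0);                /* an empty read has nothing to validate */
    return start >= newest_recycled;
}
```

The point of checking after the read() rather than before is that a
check done first could pass, the file could then be recycled, and the
read() would return new data without anyone noticing.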

Or you could do it entirely in walsender, by calling fstat() on the
open file instead of checking the variable in shared memory. If the
filename isn't what you expect, indicating that it's been recycled,
throw an error. But that needs an extra fstat() call for every read().
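
A rough sketch of the file-based check follows. One assumption to flag
loudly: fstat() by itself does not report a file name, so this variant
compares the device and inode of the open descriptor with what stat()
returns for the expected path, which costs a second system call. The
function name fd_still_named is made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/stat.h>

/*
 * Returns true if 'fd' still refers to the file currently found at
 * 'expected_path'. If the segment was renamed for recycling or removed,
 * the path either no longer exists or names a different inode.
 */
bool
fd_still_named(int fd, const char *expected_path)
{
    struct stat by_fd;
    struct stat by_path;

    assert(expected_path != NULL);

    if (fstat(fd, &by_fd) != 0)
        return false;               /* descriptor is not usable */
    if (stat(expected_path, &by_path) != 0)
        return false;               /* e.g. ENOENT: renamed away or deleted */

    /* Same device and inode means the path still names our open file. */
    return by_fd.st_dev == by_path.st_dev &&
           by_fd.st_ino == by_path.st_ino;
}
```

As with the shared-memory variant, the sending process would call this
after each read() and before sending, and raise an error when it returns
false.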

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
