Re: Streaming replication and a disk full in primary

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Streaming replication and a disk full in primary
Date: 2010-04-07 17:11:00
Message-ID: h2t603c8f071004071011mac3d845ahdc642368f6a01972@mail.gmail.com
Lists: pgsql-hackers

On Wed, Apr 7, 2010 at 6:02 AM, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> This task has been languishing for a long time, so I took a shot at it.
> I took the approach I suggested before, keeping a variable in shared
> memory to track the latest removed WAL segment. After walsender has read
> a bunch of WAL records from a WAL file, it checks that what it read is
> after the latest removed WAL segment, otherwise the data it read might
> have come from a file that was already recycled and overwritten with new
> data, and an error is thrown.
>
> This changes the behavior so that if a standby server doing streaming
> replication falls behind too much, the primary will remove/recycle a WAL
> segment needed by the standby server. The previous behavior was that WAL
> segments still needed by any connected standby server were never
> removed, at the risk of filling the disk in the primary if a standby
> server behaves badly.
>
> In your version of this patch, the default was still the current
> behavior where the primary retains WAL files that are still needed by
> connected standby servers indefinitely. I think that's a dangerous
> default, so I changed it so that if you don't set standby_keep_segments,
> the primary doesn't retain any extra segments; the number of WAL
> segments available for standby servers is determined only by the
> location of the previous checkpoint, and the status of WAL archiving.
> That makes the code a bit simpler too, as we never care how far the
> walsenders are. In fact, the GetOldestWALSenderPointer() function is now
> dead code.

This seems like a very useful feature, but I can't speak to the code
quality without a good deal more study.

...Robert
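
The check described in the quoted text (read first, then compare the
segment that was read against the latest removed segment) might look
roughly like the following. This is a minimal standalone sketch, not code
from the patch: RemovedSegState, note_segment_removed and
read_is_still_valid are hypothetical names, and a plain struct stands in
for the shared-memory variable and whatever locking protects it.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the shared-memory variable that tracks the
 * latest removed WAL segment. A real server would keep this in shared
 * memory and guard it with a lock; both are omitted here.
 */
typedef struct
{
    bool     any_removed;       /* has any segment been removed yet? */
    uint64_t last_removed_seg;  /* number of the latest removed segment */
} RemovedSegState;

/* Record that segment 'segno' was removed or recycled. */
void
note_segment_removed(RemovedSegState *st, uint64_t segno)
{
    /* Only ever move forward, so the value is the latest removal. */
    if (!st->any_removed || segno > st->last_removed_seg)
    {
        st->any_removed = true;
        st->last_removed_seg = segno;
    }
}

/*
 * Call AFTER reading from segment 'segno'. Returns true if the data can
 * be trusted, false if the segment may have been recycled and
 * overwritten in the meantime (the caller would then raise an error).
 */
bool
read_is_still_valid(const RemovedSegState *st, uint64_t segno)
{
    return !st->any_removed || segno > st->last_removed_seg;
}
```

In this sketch the check runs after the read, as in the description
above: a check made only before the read would presumably leave a window
in which the file could be recycled while it is being read.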
