Re: Unarchived WALs deleted after crash

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: Jehan-Guillaume de Rorthais <jgdr(at)dalibo(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: Unarchived WALs deleted after crash
Date: 2013-02-15 17:43:08
Message-ID: 511E73AC.2070907@vmware.com
Lists: pgsql-hackers

On 15.02.2013 19:16, Fujii Masao wrote:
> On Sat, Feb 16, 2013 at 2:07 AM, Heikki Linnakangas
> <hlinnakangas(at)vmware(dot)com> wrote:
>> On 15.02.2013 18:10, Fujii Masao wrote:
>>>
>>> At least in 9.2, when the archived file is restored into pg_xlog, its
>>> xxx.done archive status file is created. So we don't need to check
>>> InArchiveRecovery when deleting old WAL files. Checking whether xxx.done
>>> exists is enough.
>>
>> Hmm, what about streamed WAL files? I guess we could go back to the pre-9.2
>> coding, and check WalRcvInProgress(). But I didn't actually like that too
>> much, it seems rather random that old streamed files are recycled when wal
>> receiver is running at the time of restartpoint, and otherwise not. Because
>> whether wal receiver is running at the time the restartpoint happens has
>> little to do with which files were created by streaming replication. With
>> the right pattern of streaming files from the master, but always being
>> temporarily disconnected when the restartpoint runs, you could still
>> accumulate WAL files infinitely.
>
> Walreceiver always creates a .done file when it closes the already-flushed
> WAL file and switches to the next one. So we also don't need to check
> WalRcvInProgress().

Ah, I missed that part of the patch.

Okay, agreed, that's a better fix. I committed your forward-port of the
9.2 patch to master, reverted my earlier fix for this bug, and simply
removed the InArchiveRecovery/ArchiveRecoveryInProgress()/RecoveryInProgress()
condition from RemoveOldXlogFiles().

- Heikki
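
For readers following the thread, the rule agreed on above (an old WAL
file may be removed only if its xxx.done status file exists, regardless of
recovery state or whether walreceiver is running) can be modelled roughly as
below. This is an illustrative Python sketch, not the PostgreSQL C code: the
function names removable_segments and is_wal_segment are hypothetical, the
cutoff comparison is simplified to a plain string comparison on the full file
name, and the pg_xlog/archive_status layout is assumed from the discussion.

```python
import os

HEX_DIGITS = set("0123456789ABCDEF")


def is_wal_segment(name):
    # WAL segment file names are 24 upper-case hex characters.
    return len(name) == 24 and set(name) <= HEX_DIGITS


def removable_segments(xlog_dir, cutoff):
    """Return segments older than `cutoff` whose .done status file exists.

    Hypothetical model of the check discussed in the thread: no test for
    recovery state or for a running walreceiver, only for xxx.done.
    """
    status_dir = os.path.join(xlog_dir, "archive_status")
    removable = []
    for name in sorted(os.listdir(xlog_dir)):
        # Skip non-segment entries and anything at or past the cutoff
        # (simplified: compares the whole name, timeline included).
        if not is_wal_segment(name) or name >= cutoff:
            continue
        # Only segments marked as archived (or restored/streamed and
        # therefore already marked .done) are candidates for removal.
        if os.path.exists(os.path.join(status_dir, name + ".done")):
            removable.append(name)
    return removable
```

In this model a segment without a .done file is always kept, which is the
behaviour that prevents unarchived WAL from being deleted after a crash.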
