Re: [RFC] What should we do for reliable WAL archiving?

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: MauMau <maumau307(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [RFC] What should we do for reliable WAL archiving?
Date: 2014-03-29 21:38:03
Message-ID: CAMkU=1wM5CvcTQB2DXnt1v_NcmT0e=aiWQCZJ7+Zci4gB-HovQ@mail.gmail.com
Lists: pgsql-hackers

On Fri, Mar 21, 2014 at 2:22 PM, MauMau <maumau307(at)gmail(dot)com> wrote:

> From: "Jeff Janes" <jeff(dot)janes(at)gmail(dot)com>
>
>> Do people really just copy the files from one directory of local storage
>> to another directory of local storage? I don't see the point of that.
>>
>
> It makes sense to archive WAL to a directory of local storage for media
> recovery. Here, the local storage is a different disk drive which is
> directly attached to the database server or directly connected through SAN.

For a SAN I guess we have different meanings of "local" :)
(I have no doubt yours is correct--the fine art of IT terminology is not my
thing.)

>> The recommendation is to refuse to overwrite an existing file of the same
>> name, and exit with failure. Which essentially brings archiving to a halt,
>> because it keeps trying but it will keep failing. If we make a custom
>> version, one thing it should do is determine if the existing archived file
>> is just a truncated version of the attempting-to-be archived file, and if
>> so overwrite it. Because if the first archival command fails with a
>> network glitch, it can leave behind a partial file.
>>
>
> What I'm trying to address is just an alternative to cp/copy which fsyncs
> a file. It just overwrites an existing file.
>
> Yes, you're right, the failed archive attempt leaves behind a partial file
> which causes subsequent attempts to fail, if you follow the PG manual.
> That's another undesirable point in the current doc. To overcome this,
> someone on this ML recommended me to do "cp %p /archive/dir/%f.tmp && mv
> /archive/dir/%f.tmp /archive/dir/%f". Does this solve your problem?
>

As written it doesn't solve it, as it just unconditionally overwrites the
file. If you wanted that you could just do the single-statement
unconditional overwrite.

You could make it so that the .tmp gets overwritten unconditionally, but
the move of it will not overwrite an existing permanent file. That would
solve the problem where a glitch in the network leaves an incomplete file
behind that blocks the next attempt, *except* that mv on (at least some)
network file systems is really a copy, and not an atomic rename, so is
still subject to leaving behind incomplete crud.
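
To make the idea concrete, here is a rough sketch in Python of what such a
custom archive command might look like. Everything in it is an assumption
for illustration: the names archive() and is_truncated_copy() are made up,
it relies on hard links (os.link) to get a "fail if the target exists"
publish step, and it has not been tried on any network file system, where
link and rename may not behave atomically. It overwrites the .tmp file
unconditionally, fsyncs it, refuses to replace an existing archived file,
and makes an exception only when the existing file is a truncated prefix of
the segment being archived.

```python
import os
import shutil


def is_truncated_copy(existing: str, source: str, chunk: int = 1 << 20) -> bool:
    """Return True if `existing` is a strictly shorter prefix of `source`."""
    if os.path.getsize(existing) >= os.path.getsize(source):
        return False
    with open(existing, "rb") as old, open(source, "rb") as new:
        while True:
            a = old.read(chunk)
            if not a:
                return True  # reached end of the short file without a mismatch
            if a != new.read(len(a)):
                return False


def archive(source: str, dest: str) -> int:
    """Hypothetical archive step: returns 0 on success, 1 on refusal."""
    tmp = dest + ".tmp"
    replace_partial = False
    if os.path.exists(dest):
        if not is_truncated_copy(dest, source):
            return 1  # a different or complete file is already there: refuse
        replace_partial = True

    # The temporary file is overwritten unconditionally and fsynced.
    with open(source, "rb") as src, open(tmp, "wb") as out:
        shutil.copyfileobj(src, out)
        out.flush()
        os.fsync(out.fileno())

    if replace_partial:
        os.replace(tmp, dest)  # overwrite the leftover partial file
        return 0
    try:
        os.link(tmp, dest)  # raises FileExistsError if dest appeared meanwhile
    except FileExistsError:
        os.unlink(tmp)
        return 1
    os.unlink(tmp)
    return 0
```

A wrapper script would call archive() with the expanded %p and the target
path built from %f, and exit with its return value. Note that the sketch
does not fsync the containing directory, and it treats an identical,
complete existing file as a failure, which matches the documented
recommendation but may not be what you want.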

But, it is hard to tell what the real solution is, because the doc doesn't
explain why it should refuse (and fail) to overwrite an existing file. The
only reason I can think of to make that recommendation is because it is
easy to accidentally configure two clusters to attempt to archive to the
same location, and having them overwrite each other's files should be
guarded against. If I am right, it seems like this reason should be added
to the docs, so people know what they are defending against. And if I am
wrong, it seems even more important that the (correct) reason is added to
the docs.

Cheers,

Jeff
