Re: [patch] pg_copy - a command for reliable WAL archiving

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: MauMau <maumau307(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [patch] pg_copy - a command for reliable WAL archiving
Date: 2014-08-15 14:46:31
Message-ID: CAHGQGwFXvnfPVhUS_9Y5HXqUrxxV4YeNN77NFOZc7-PLRGMxCA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Aug 14, 2014 at 1:52 PM, MauMau <maumau307(at)gmail(dot)com> wrote:
> I fixed some minor mistakes.

What's the main purpose of this tool? If it's for WAL archiving, the tool name
"pg_copy" sounds too generic. We already have pg_archivecleanup, so maybe
"pg_archivecopy" or something similar would be better for consistency?

pg_copy in the patch copies the file to the destination in a straightforward
way, i.e., it copies the file directly to the destination file under its
final name. This can cause a problem that some people have reported: when the
server crashes while a WAL file is being archived by the cp command, a
partially-filled WAL file remains in the archive area. This half-baked archive
file can cause various troubles. To address this, the WAL file needs to be
copied to a temporary file first, then renamed to the actual name. I think
that pg_copy should copy the WAL file in that way.
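
As a rough illustration of the copy-to-temporary-then-rename approach, here is
a minimal Python sketch. It is not the patch's code; the function name
`archive_copy` and the `.tmp_` prefix are assumptions made up for this example.
The temporary file is created in the destination directory so that the final
rename stays on one filesystem.

```python
import os
import shutil
import tempfile


def archive_copy(src, dest):
    """Copy src to dest so that dest never appears half-written.

    Hypothetical helper for illustration only, not pg_copy itself.
    """
    dest_dir = os.path.dirname(os.path.abspath(dest))
    # Temporary file lives next to the final file, so the rename below
    # does not cross a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(prefix=".tmp_", dir=dest_dir)
    try:
        with os.fdopen(fd, "wb") as out, open(src, "rb") as inp:
            shutil.copyfileobj(inp, out)
        # The final name only becomes visible once the copy is complete.
        os.replace(tmp_path, dest)
    except BaseException:
        # Clean up the leftover temporary file on any failure.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Note that `os.replace` silently overwrites an existing destination; a real
archiving command would probably want to refuse to do that.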

Currently pg_copy always syncs the archive file, and there is no way to disable
that. But I'm sure that not everyone wants to sync the archive file. So I think
that it's better to add an option to pg_copy specifying whether to sync the
file or not.
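
To make the idea of a sync switch concrete, the following Python sketch shows
a copy routine where flushing to stable storage is optional. The function name
and the `sync` parameter are hypothetical; the patch does not define such an
option, and a real tool would expose it as a command-line flag.

```python
import os
import shutil


def copy_with_optional_sync(src, dest, sync=True):
    """Copy src to dest; fsync the destination only when sync is True.

    Hypothetical helper for illustration only.
    """
    with open(src, "rb") as inp, open(dest, "wb") as out:
        shutil.copyfileobj(inp, out)
        if sync:
            # Push Python's buffer to the OS, then ask the OS to write
            # the file data to stable storage.
            out.flush()
            os.fsync(out.fileno())
```

With `sync=False` the copy returns as soon as the data is handed to the OS,
which is faster but leaves the archive file vulnerable to an OS crash.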

Some users might want to specify whether to call posix_fadvise or not because
they might need to re-read the archived files just after archiving, for
example, to copy the archived files over the network from the archive area to
a remote site for disaster recovery.
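
A sketch of what such a switch could look like, in Python. This assumes the
advice in question is POSIX_FADV_DONTNEED (dropping the copied file from the
OS cache), which the discussion above does not state explicitly; the function
name and the `drop_cache` parameter are likewise made up for this example.

```python
import os
import shutil


def copy_with_cache_hint(src, dest, drop_cache=True):
    """Copy src to dest, optionally hinting the OS to drop dest from cache.

    Hypothetical helper for illustration only. Returns True if the hint
    was actually issued.
    """
    with open(src, "rb") as inp, open(dest, "wb") as out:
        shutil.copyfileobj(inp, out)
        out.flush()
        if drop_cache and hasattr(os, "posix_fadvise"):
            # Pages that are still dirty are not dropped, so write them
            # out before giving the advice.
            os.fsync(out.fileno())
            # offset 0 and length 0 mean "the whole file".
            os.posix_fadvise(out.fileno(), 0, 0, os.POSIX_FADV_DONTNEED)
            return True
    return False
```

Someone who ships the archived files to a remote site right after archiving
would call this with `drop_cache=False` so the data stays cached for the
second read.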

Do you recommend using pg_copy for restore_command? If yes, that should also
be documented. And in the WAL restore case, the restored WAL files are re-read
soon by recovery, so posix_fadvise is not good in that case.

Direct I/O and posix_fadvise are used only for the destination file. But why not
for the source file? That might be useful especially in the restore_command case.

Finally, the big question is: is there really no OS command which provides
the same functionality as pg_copy does? If there is, I'd basically like to avoid
duplicating work.

Regards,

--
Fujii Masao
