From: | Andres Freund <andres(at)2ndquadrant(dot)com> |
---|---|
To: | Bruce Momjian <bruce(at)momjian(dot)us> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, MauMau <maumau307(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [patch] pg_copy - a command for reliable WAL archiving |
Date: | 2014-08-20 23:10:53 |
Message-ID: | 20140820231053.GA9835@alap3.anarazel.de |
Lists: | pgsql-hackers |
On 2014-08-20 18:58:05 -0400, Bruce Momjian wrote:
> On Wed, Aug 20, 2014 at 10:36:40AM -0400, Tom Lane wrote:
> > Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> > > On 2014-08-20 10:19:33 -0400, Tom Lane wrote:
> > >> Alternatively, you could use the process PID as part of the temp file
> > >> name; which is probably a good idea anyway.
> >
> > > I think that's actually worse, because nothing will clean up those
> > > unless you explicitly scan for all <whatever>.$pid files, and somehow
> > > kill them.
> >
> > True. As long as the copy command is prepared to get rid of a
> > pre-existing target file, using a fixed .tmp extension should be fine.
>
> Well, then we are back to this comment by MauMau:
> > With that said, copying to a temporary file like <dest>.tmp and
> > renaming it to <dest> sounds worthwhile even as a basic copy utility.
> > I want to avoid copying to a temporary file with a fixed name like
> > _copy.tmp, because some advanced utility may want to run multiple
> > instances of pg_copy to copy several files into the same directory
> > simultaneously. However, I'm afraid multiple <dest>.tmp files might
> > continue to occupy disk space after canceling copy or power failure in
> > some use cases, where the copy of the same file won't be retried.
> > That's also the reason why I chose to not use a temporary file like
> > cp/copy.
>
> Do we want cases where the same directory is used by multiple pg_copy
> processes? I can't imagine how that setup would make sense.
I don't think anybody is proposing a fixed name like _copy.tmp. We've
only argued about the risk of <dest>.tmp. And I argued, and others seem
to agree, that the space usage problem isn't really relevant, because
archive commands and the like are rerun after a failure and can then
clean up the temp file.
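To make the <dest>.tmp argument concrete, here is a minimal sketch of the copy-then-rename flow, written in Python purely for illustration. It is not pg_copy's implementation; the function names `copy_via_temp` and `demo` and the file names are hypothetical. It assumes source and destination are on the same filesystem, so the rename is atomic, and it shows that a rerun simply overwrites a stale <dest>.tmp left by an interrupted attempt.

```python
import os
import shutil
import tempfile


def copy_via_temp(src, dest):
    """Copy src to dest by writing <dest>.tmp first, then renaming it."""
    tmp = dest + ".tmp"
    # Opening with "wb" truncates any stale <dest>.tmp from a failed run,
    # so no separate cleanup pass is needed.
    with open(src, "rb") as fin, open(tmp, "wb") as fout:
        shutil.copyfileobj(fin, fout)
        fout.flush()
        os.fsync(fout.fileno())  # push the data to disk before the rename
    # Readers of dest see either the old state or the complete new file.
    os.replace(tmp, dest)


def demo():
    """Simulate a leftover temp file, rerun the copy, report the result."""
    with tempfile.TemporaryDirectory() as d:
        src = os.path.join(d, "segment")            # hypothetical names
        dest = os.path.join(d, "archived_segment")
        with open(src, "wb") as f:
            f.write(b"wal data")
        # Leftover from an interrupted earlier attempt.
        with open(dest + ".tmp", "wb") as f:
            f.write(b"partial garbage longer than the real data")
        copy_via_temp(src, dest)
        with open(dest, "rb") as f:
            content = f.read()
        return content, os.path.exists(dest + ".tmp")
```

A complete implementation would probably also fsync the containing directory after the rename; that is left out here to keep the sketch short.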
> I am thinking pg_copy should emit a warning message when it removes an
> old temp file. This might alert people that something odd is happening
> if they see the message often.
I don't really see the point of this. If the archive command or the like
failed, that will already have been logged. I'd expect this to be
implemented by passing O_CREAT | O_TRUNC to open(), nothing else.
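As a rough illustration of what those flags do, the sketch below uses Python's `os.open`, which passes them through to the underlying open() call. The helper names and the temp file name are hypothetical, and `O_WRONLY` is added only because open() needs an access mode. A pre-existing temp file is silently emptied at open time, with no unlink and no warning.

```python
import os
import tempfile


def open_temp_for_write(path):
    """Open path for writing, creating it or discarding old contents."""
    # O_CREAT: create the file if it does not exist.
    # O_TRUNC: if it does exist, cut it to zero length.
    return os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)


def demo_truncate():
    """Show that a stale temp file is emptied as soon as it is opened."""
    with tempfile.TemporaryDirectory() as d:
        tmp = os.path.join(d, "segment.tmp")  # hypothetical file name
        with open(tmp, "wb") as f:
            f.write(b"stale bytes from a failed run")
        fd = open_temp_for_write(tmp)
        try:
            size_after_open = os.fstat(fd).st_size
            os.write(fd, b"new")
        finally:
            os.close(fd)
        with open(tmp, "rb") as f:
            return size_after_open, f.read()
```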
> The pid-extension idea would work as pg_copy can test all pid extension
> files to see if the pid is still active. However, that assumes that the
> pid is running on the local machine and not on another machine that has
> NFS-mounted this directory, so maybe this is a bad idea, but again, we
> are back to the idea that only one process should be writing into this
> directory.
I don't actually think we should assume that. There very well could be
one process running an archive command, using differently prefixed file
names or such.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services