Re: PITR logging control program

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PITR logging control program
Date: 2004-04-29 18:34:47
Message-ID: 1083263687.3100.7.camel@stromboli
Lists: pgsql-hackers

On Thu, 2004-04-29 at 15:22, Bruce Momjian wrote:
> Alvaro Herrera wrote:
> > On Thu, Apr 29, 2004 at 10:07:01AM -0400, Bruce Momjian wrote:
> > > Alvaro Herrera wrote:
> >
> > > > Is the API able to indicate a written but not-yet-filled WAL segment?
> > > > So an archiver could copy the filled part, and refill it later. This
> > > > may be needed because a segment could take a while to be filled.
> > >
> > > I couldn't figure that out, but I don't think it does. It would have to
> > > lock the WAL writes so it could get a good copy, I think, and I didn't
> > > see that.
> >
> > I'm not sure but I don't think so. You don't have to lock the WAL for
> > writing, because it will always write later in the file than you are
> > allowed to read. (If you read more than you were told to, it's your
> > fault as an archiver.)
>
> My point was that without locking the WAL, we might get part of a WAL
> write in our file, but I now realize that during a crash the same thing
> might happen, so it would be OK to just copy it even if it is being
> written to.
>
> Simon posted the rest of his patch that shows changes to the backend,
> and a comment reads:
>
> + * The name of the notification file is the message that will be picked up
> + * by the archiver, e.g. we write RLogDir/00000001000000C6.full
> + * and the archiver then knows to archive XLogDir/00000001000000C6,
> + * while it is doing so it will rename RLogDir/00000001000000C6.full
> + * to RLogDir/00000001000000C6.busy, then when complete, rename it again
> + * to RLogDir/00000001000000C6.done
>
> so it is only archiving full logs.
>
> Also, I think this archiver should be able to log to a local drive,
> network drive (trivial), tape drive, ftp, or use an external script to
> transfer the logs somewhere. (ftp would probably be an external script
> with 'expect').

Bruce is correct: the API waits for a log file to be full before
archiving it.
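
As a rough illustration of the notification protocol in the comment quoted
above, here is a minimal sketch of an archiver loop. It is not pg_arch and
not the patch's code: the function name, its arguments and the use of a plain
file copy as the archive step are all assumptions made for the example. Only
the .full/.busy/.done suffixes and the RLogDir/XLogDir roles come from the
comment.

```python
import os
import shutil


def archive_full_segments(rlog_dir, xlog_dir, archive_dir):
    """Archive each segment that has a <segment>.full notification file.

    Hypothetical helper: rlog_dir holds the notification files, xlog_dir
    holds the WAL segments, archive_dir is wherever the copies should go.
    Returns the names of the segments archived in this pass.
    """
    archived = []
    for name in sorted(os.listdir(rlog_dir)):
        if not name.endswith(".full"):
            continue  # ignore .busy, .done and anything unrelated
        segment = name[: -len(".full")]
        full = os.path.join(rlog_dir, name)
        busy = os.path.join(rlog_dir, segment + ".busy")
        done = os.path.join(rlog_dir, segment + ".done")

        # Mark the segment as being archived.
        os.rename(full, busy)
        # Assumed archive step: a plain copy to another directory.
        shutil.copy2(os.path.join(xlog_dir, segment),
                     os.path.join(archive_dir, segment))
        # Mark the segment as archived.
        os.rename(busy, done)
        archived.append(segment)
    return archived
```

If the copy fails partway, this sketch leaves the .busy file in place, so an
interrupted segment stays visible instead of being reported as done. The copy
step could equally be a tape write or an external script.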

I had thought about the case for partial archiving: basically, if you
want to archive in smaller chunks, make your log files smaller. At
present that is a compile-time option. Possibly there is an argument for
making the xlog file size configurable, as a way of doing what you suggest.
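
To put rough numbers on that trade-off, the sketch below estimates how long a
segment takes to fill, which is how long its contents wait before the
archiver is told about it. The function name and the WAL write rate are
assumptions for illustration; 16 MB is the default segment size.

```python
def seconds_to_fill(segment_bytes, wal_bytes_per_sec):
    """Return how long a segment takes to fill at a steady WAL write rate."""
    if wal_bytes_per_sec <= 0:
        raise ValueError("WAL write rate must be positive")
    return segment_bytes / wal_bytes_per_sec


MB = 1024 * 1024
rate = 100 * 1024  # assumed workload: 100 kB of WAL per second

# Compare the default segment size with two smaller hypothetical sizes.
for size_mb in (16, 4, 1):
    wait = seconds_to_fill(size_mb * MB, rate)
    print(f"{size_mb:>2} MB segment fills in about {wait:.0f} s")
```

Under a steady rate the wait scales linearly with segment size, so a quarter
of the size means a quarter of the wait, at the cost of four times as many
files to archive.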

Taking multiple copies of the same file and then trying to work out which
one to apply sounds complex and error-prone to me. It also increases the
cost of the archival process and thus drains other resources.

The archiver should be able to do a whole range of things. That point
was discussed, and the agreed approach was to provide an API that would
allow anybody and everybody to write whatever they wanted. The design
included pg_arch since it was clear that the basic product would need
those facilities. In any case, any practically focused API has a
reference implementation as a way of showing how to use it and of
exposing any bugs in the server-side implementation.

The point is that everybody is now empowered to write tape drive code, or
whatever you fancy. Go do.

Best regards, Simon Riggs
