Re: PITR logging control program

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PITR logging control program
Date: 2004-04-29 19:24:40
Message-ID: 200404291924.i3TJOeG29969@candle.pha.pa.us
Lists: pgsql-hackers

Simon Riggs wrote:
> > Also, I think this archiver should be able to log to a local drive,
> > network drive (trivial), tape drive, ftp, or use an external script to
> > transfer the logs somewhere. (ftp would probably be an external script
> > with 'expect').
>
> Bruce is correct, the API waits for the archive to be full before
> archiving.
>
> I had thought about the case for partial archiving: basically, if you
> want to archive in smaller chunks, make your log files smaller...this is
> now a compile time option. Possibly there is an argument to make the
> xlog file size configurable, as a way of doing what you suggest.
>
> Taking multiple copies of the same file, yet trying to work out which
> one to apply sounds complex and error prone to me. It also increases the
> cost of the archival process and thus drains other resources.
>
> The archiver should be able to do a whole range of things. Basically,
> that point was discussed and the agreed approach was to provide an API
> that would allow anybody and everybody to write whatever they wanted.
> The design included pg_arch since it was clear that there would be a
> requirement in the basic product to have those facilities - and in any
> case any practically focused API has a reference port as a way of
> showing how to use it and exposing any bugs in the server side
> implementation.
>
> The point is...everybody is now empowered to write tape drive code,
> whatever you fancy.... go do.

Agreed, we want to give the superuser control over how the archive logs
are written. The question is how they get that control: by running a
client program continuously, or by having the backend call an interface
script?

My point was that having the backend call the program gives better
reliability, more control over when to write, and easier administration.

How are people going to run pg_arch? Via nohup? In virtual screens? If
I am at the console and I want to start it, do I use "&"? If I want to
stop it, do I do a 'ps' and issue a 'kill'? This doesn't seem like a
good user interface to me.

To me the problem isn't pg_arch itself but the idea that a client
program is going to be independently finding (polling for) and copying
the archive logs.

I am thinking the client program is called with two arguments: the xlog
file name and the arch location defined in GUC. Then the client
program does the write. The problem there, though, is who gets the write
error, since the backend will not wait around for completion.
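
To make the two-argument idea concrete, here is a rough sketch in Python (chosen only for brevity). The names archive_segment and main, the copy-to-temporary-name-then-rename step, and the use of the exit status to report a write error are all my assumptions for illustration, not an agreed design. It also does not settle who reads that status if the backend does not wait.

```python
import os
import shutil
import sys


def archive_segment(xlog_path, arch_dir):
    """Copy one xlog file into arch_dir. Return 0 on success, 1 on failure."""
    final_path = os.path.join(arch_dir, os.path.basename(xlog_path))
    tmp_path = final_path + ".tmp"
    try:
        # Write under a temporary name first, so a partial copy is never
        # mistaken for a complete archived file.
        shutil.copyfile(xlog_path, tmp_path)
        os.replace(tmp_path, final_path)
    except OSError as err:
        print("archive of %s failed: %s" % (xlog_path, err), file=sys.stderr)
        try:
            os.remove(tmp_path)  # best-effort cleanup of a partial copy
        except OSError:
            pass
        return 1
    return 0


def main(argv):
    """Hypothetical command-line entry: argv[1] = xlog file, argv[2] = arch location."""
    if len(argv) != 3:
        print("usage: archive_segment XLOG_FILE ARCH_LOCATION", file=sys.stderr)
        return 2
    return archive_segment(argv[1], argv[2])
```

A wrapper script would end with sys.exit(main(sys.argv)), so that whoever launched the program can see a nonzero status when the write fails.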

Another case is server start/stop. You want to start/stop the archive
logger to match the database server, particularly if you reboot the
server. I know Informix used a client program for logging, and it was a
pain to administer.

I would be happy with an external program if it was started/stopped by the
postmaster (or via GUC change) and received a signal when a WAL file was
written. But if we do that, it isn't really an external program anymore
but another child process like our stats collector.
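
As a rough illustration of the signal-driven child (Python again, Unix only), something like the sketch below. Everything in it is hypothetical: the choice of SIGUSR1 for "a WAL file was written" and SIGTERM for shutdown, the ready_dir directory assumed to hold only completed files, and the archive_one callback. It is not how the stats collector or any existing code works.

```python
import os
import signal

WAL_WRITTEN = signal.SIGUSR1  # assumed: parent sends this after a WAL file is complete
SHUTDOWN = signal.SIGTERM     # assumed: parent sends this when the server stops


def setup_signals():
    """Block both signals so they stay pending until sigwait() collects them."""
    signal.pthread_sigmask(signal.SIG_BLOCK, {WAL_WRITTEN, SHUTDOWN})


def wait_for_signal():
    """Sleep until one of the two signals is pending; return its number."""
    return signal.sigwait({WAL_WRITTEN, SHUTDOWN})


def archive_new_files(ready_dir, archive_one, done):
    """Try to archive each file not yet in `done`; return the names that failed."""
    failed = []
    for name in sorted(os.listdir(ready_dir)):
        if name in done:
            continue
        if archive_one(os.path.join(ready_dir, name)):
            done.add(name)
        else:
            failed.append(name)  # left out of `done`, so retried on the next signal
    return failed


def archiver_main(ready_dir, archive_one, wait=wait_for_signal):
    """Child main loop: no polling, work happens only when the parent signals.

    With the default `wait`, call setup_signals() before this function.
    """
    done = set()
    while True:
        if wait() == SHUTDOWN:
            return done
        archive_new_files(ready_dir, archive_one, done)
```

The child would call setup_signals() once at startup and then archiver_main(ready_dir, archive_one). Because the signal carries no file name, the loop rescans the directory each time, which also means a file that failed to archive is retried on the next signal.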

I am willing to work on this if folks think this is a better approach.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
