PITR logging control program

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: PITR logging control program
Date: 2004-04-29 04:18:38
Message-ID: 200404290418.i3T4Icw23759@candle.pha.pa.us
Lists: pgsql-hackers

Simon Riggs wrote:
> > When the server is not running there is nothing to archive, so I don't follow
> > this argument.
>
> The running server creates xlogs, which are still available for archive
> even when the server is not running...
>
> Overall, your point is taken, with many additional comments in my other
> posts in reply to you.
>
> I accept that this may be desirable in the future, for some simple
> implementations. The pg_autovacuum evolution path is a good model - if
> it works and the code is stable, bring it under the postmaster at a
> later time.

[ This email isn't focused because I haven't resolved all my ideas yet.]

OK, I looked over the code. Basically it appears pg_arch is a
client-side program that copies files from pg_xlog to a specified
directory, and marks completion in a new pg_rlog directory.

The driving part of the program seems to be:

    while ( (n = read( xlogfd, buf, BLCKSZ)) > 0)
        if ( write( archfd, buf, n) != n)
            return false;
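
For illustration only, here is a fuller sketch of what a copy routine built around that loop might look like. It is not the code from the patch: the function name copy_segment, the 8192-byte buffer (standing in for BLCKSZ), the O_EXCL open, and the fsync before reporting success are all assumptions made for the example.

```c
#define _POSIX_C_SOURCE 200809L

#include <fcntl.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

#define COPY_BUF_SIZE 8192      /* assumed; the patch reads BLCKSZ bytes */

/*
 * Hypothetical helper: copy the file at src to a new file at dst.
 * Returns true only if every byte was written and flushed to disk.
 */
static bool
copy_segment(const char *src, const char *dst)
{
    char        buf[COPY_BUF_SIZE];
    ssize_t     n;
    bool        ok = false;
    int         srcfd;
    int         dstfd;

    srcfd = open(src, O_RDONLY);
    if (srcfd < 0)
        return false;

    /* O_EXCL: refuse to overwrite a file that already exists at dst */
    dstfd = open(dst, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (dstfd < 0)
    {
        close(srcfd);
        return false;
    }

    while ((n = read(srcfd, buf, sizeof(buf))) > 0)
    {
        ssize_t     off = 0;

        /* write() may accept fewer bytes than requested, so loop */
        while (off < n)
        {
            ssize_t     w = write(dstfd, buf + off, (size_t) (n - off));

            if (w <= 0)
                goto done;
            off += w;
        }
    }

    /* n == 0 means clean end of file; n < 0 means a read error */
    if (n == 0 && fsync(dstfd) == 0)
        ok = true;

done:
    close(srcfd);
    if (close(dstfd) != 0)
        ok = false;
    return ok;
}
```

The inner loop is the main difference from the quoted code: a short write is retried rather than treated as a failure.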

The program basically sleeps and when it awakes checks to see if new WAL
files have been created.

There is some additional GUC variable to prevent WAL from being recycled
until it has been archived, but the posted patch only had pg_arch.c, its
Makefile, and a patch to update bin/Makefile.

Simon (the submitter) specified he was providing an API to archive, but
it is really just a set of C routines to call that do copies. It is not
a wire protocol or anything like that.

The program has a mode where it archives all available WAL files and
exits, but by default it has to remain running to continue archiving.

I am wondering if this is the way to approach the situation. I
apologize for not considering this earlier. Archives of PITR postings
of interest are at:

http://momjian.postgresql.org/cgi-bin/pgtodo?pitr

It seems the backend is the one who knows right away when a new WAL file
has been created and needs to be archived.

Also, are folks happy with archiving only full WAL files? This will not
restore all transactions up to the point of failure, but might lose
perhaps 2-5 minutes of transactions before the failure.

Also, a client application is a separate process that must remain
running. With Informix, there is a separate utility to do PITR logging.
It is a pain to have to make sure a separate process is always running.

Here is an idea. What if we add two GUC settings:

pitr = true/false;
pitr_path = 'filename or |program';

In this way, you would basically specify your path to dump all WAL logs
into (just keep appending 16MB chunks) or call a program that you pipe
all the WAL logs into.
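
As a rough sketch of how such a setting could be interpreted (nothing like this exists in the posted patch): a leading '|' means "pipe to this program", anything else means "append to this file". The ArchiveSink struct and the sink_open/sink_close names are hypothetical, invented for the example.

```c
#define _POSIX_C_SOURCE 200809L

#include <stdbool.h>
#include <stdio.h>

/* Hypothetical handle for wherever the WAL data is being sent */
typedef struct ArchiveSink
{
    FILE       *fp;
    bool        is_pipe;
} ArchiveSink;

/*
 * Open the destination named by a pitr_path-style string.
 * "|command" pipes the data to a program; anything else is
 * treated as a file name and opened in append mode.
 */
static bool
sink_open(ArchiveSink *sink, const char *pitr_path)
{
    if (pitr_path == NULL || pitr_path[0] == '\0')
        return false;

    if (pitr_path[0] == '|')
    {
        sink->is_pipe = true;
        sink->fp = popen(pitr_path + 1, "w");
    }
    else
    {
        sink->is_pipe = false;
        sink->fp = fopen(pitr_path, "ab");
    }
    return sink->fp != NULL;
}

/* Close the destination; returns true if the close reported success */
static bool
sink_close(ArchiveSink *sink)
{
    int         rc;

    rc = sink->is_pipe ? pclose(sink->fp) : fclose(sink->fp);
    sink->fp = NULL;
    return rc == 0;
}
```

A caller would sink_open() the configured path, fwrite() each 16MB chunk to sink.fp, and sink_close() it afterwards.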

You can't change pitr_path while pitr is on. Each backend opens the
filename in append mode before writing. One problem is that this slows
down the backend because it has to do the write, and it might be slow.

We also need the ability to write to a tape drive, and you can't
open/close those like a file. Different backends will be doing the WAL
file additions, so there isn't a central process to keep a tape drive
file descriptor open.

Seems pg_arch should at least use libpq to connect to a database and do
a LISTEN and have the backend NOTIFY when they create a new WAL file or
something. Polling for new WAL files seems non-optimal, but maybe a
database connection is overkill.

Then, you start the backend, specify the path, turn on pitr, do the tar,
and you are on your way.

Also, pg_arch should only be run by the install user. No need to allow
other users to run this.

Another idea is to have a client program like pg_ctl that controls PITR
logging (start, stop, location), but does its job and exits, rather than
remains running.

I apologize for not bringing up these issues earlier. I didn't realize
the direction it was going. I wasn't focused on it. Sorry.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
