Re: PITR Functional Design v2 for 7.5

From: Andreas Pflug <pgadmin(at)pse-consulting(dot)de>
To: josh(at)agliodbs(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: PITR Functional Design v2 for 7.5
Date: 2004-03-09 21:19:05
Message-ID: 404E34C9.7000703@pse-consulting.de
Lists: pgsql-hackers

Josh Berkus wrote:

>5) Full back-up
>
>Related to the above, what I don't see in your paper or the proposed API is a
>way to coordinate full back-ups and WAL archiving. Obviously, the PITR
>Archive is only useful in reference to an existing full backup, so it is
>important to be able to associate a set of PITR archives with a particular
>full backup, or with some kind of "backup checkpoint". I'm sure that you
>have a solution for this, I just didn't see it explained in your proposal, or
>didn't understand it.
>

As far as I understand, a full backup in the pgsql sense means all
data files including pg_clog, where all transactions before the checkpoint
are completely written to the data files. AFAICS there is a small detail
missing so far.

When I'm doing a file-level hot backup, I can't be sure about the backup
order. To be sure the cluster is in a consistent state regarding
checkpoints, pg_clog must be the first directory backed up. If this
isn't ensured, the backed-up clog could contain a checkpoint that marks
a transaction as completed even though the data file it wrote to was
backed up before that write took place.

This could be ensured by doing the backup in two steps: first backing up
pg_clog, and then the rest, with the restore being performed in the
opposite order. But this doesn't seem very fail-safe: what if the admin
doesn't know this or forgets about it? So IMHO a mechanism ensuring this
would be better. I could think of a solution where a second pg_clog
directory is used, together with a pgsql API that is called right before
performing the file backup. Josh calls this second pg_clog the "backup
checkpoint".
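
To make the two-step ordering concrete, here is a minimal sketch of a
file-level copy that takes pg_clog first on backup and puts it back last
on restore. It only illustrates the ordering argument; it is not a safe
hot-backup tool, and the function names (backup_order, backup_cluster,
restore_cluster) are made up for this example.

```python
import shutil
from pathlib import Path

CLOG_DIR = "pg_clog"  # commit-log directory name as used in this thread


def backup_order(data_dir):
    """Return the top-level entries of data_dir with pg_clog listed first."""
    entries = sorted(p.name for p in Path(data_dir).iterdir())
    first = [CLOG_DIR] if CLOG_DIR in entries else []
    rest = [name for name in entries if name != CLOG_DIR]
    return first + rest


def _copy_entry(src, dst):
    """Copy one file or directory tree; dst must not exist yet."""
    if src.is_dir():
        shutil.copytree(src, dst)
    else:
        shutil.copy2(src, dst)


def backup_cluster(data_dir, backup_dir):
    """Step 1: copy pg_clog. Step 2: copy everything else."""
    data_dir, backup_dir = Path(data_dir), Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    order = backup_order(data_dir)
    for name in order:
        _copy_entry(data_dir / name, backup_dir / name)
    return order  # returned so the caller can log or check the order


def restore_cluster(backup_dir, data_dir):
    """Restore in the opposite order, so pg_clog is written last."""
    backup_dir, data_dir = Path(backup_dir), Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    order = list(reversed(backup_order(backup_dir)))
    for name in order:
        _copy_entry(backup_dir / name, data_dir / name)
    return order
```

A script like this still depends on the admin actually using it, which
is exactly the weakness mentioned above.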

At the moment, a restart is done from clog + WAL, where clog might be
too new in a hot backup situation as mentioned above. There should be a
second pgsql restart mode, where checkpoints are not taken from the
current clog, but from the "backup checkpoint clog" which was created
explicitly at backup time. This is somewhat similar to MSSQL's backup
behaviour, where the transaction log (=WAL) keeps growing until a full
backup has been performed successfully.

Regards,
Andreas
