Re: Consistent file-level backup of pg data directory

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: gator_ml(at)yahoo(dot)de
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Consistent file-level backup of pg data directory
Date: 2014-01-10 18:59:56
Message-ID: CAMkU=1wybOpunQ-_Z5U=Cd_Q7ohSxOHLgDwvN60HSAUKehj25w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Fri, Jan 10, 2014 at 9:03 AM, <gator_ml(at)yahoo(dot)de> wrote:

> On 01/08/14 19:55, Jeff Janes wrote:
>
>> I think it would be easier to just exclude the database from the
>> system-wide backup and use a different method for it, rather than
>> engineer the necessary before/after hooks onto the system-wide backup.
>>
>
> Thanks for your comments!
>
> I really thought, it would be easier (I still have the
> impression, that almost everything needed to make a filesystem
> backup safe would be available, only the parts don't fit too well
> for that purpose). I now came up with a way how to safely get a
> postgres database into the general backup-procedure:
>
> - move the postgres data directory to a separate LVM volume
> - call pg_start_backup()
> - create a LVM snapshot
> - call pg_stop_backup() and update the snapshot's pg_xlog from
> the "real" volume
> - create a recovery.conf on the snapshot
> - run our normal backup procedure with the "real" postgres data
> volume replaced by the prepared snapshot
> - remove the snapshot again
> - If the backup is restored to a machine, postgres will find the
> recovery.conf file and initiate the recovery
>

Provided the entire database (including all tablespaces, all pg_xlog)
participate in the snapshot atomically, you can skip almost all of those
steps (and you should, because including extraneous steps and confusing and
therefore dangerous):

0) do a one-time permanent rearrangement so the database lives on a
separate volume.
1) snapshot.
2) backup the frozen snapshot.
3) release the snapshot.

When you start postgres based on the copied snapshot, the database will
think that it crashed, and will go through soft crash recovery (recovery by
using the files it finds in pg_xlog). You don't need recovery.conf,
because you are not using a WAL archive, only the WAL that is naturally
left in the pg_xlog. Since you are not doing a hard recovery, you don't
need pg_start_backup(), etc.

> Actually, I wonder if in this scenario, the pg_start/stop_backup
> is really necessary - would it be safe to just tell postgres to
> create a checkpoint right before the snapshot and directly use
> the "frozen" data directory in the backup? (The postgres
> documentation suggests this).
>

Yep. Assuming you trust LVM snapshot code to be free of bugs.

>
> Another point in favor of the slightly cumbersome solution above
> would be the ability to combine it with Point-in-Time Recovery.
>

Now I'm a bit confused. I thought you were not doing actual log archiving,
and did not want to do so. If you are not keeping the log archive, then
you can't use PITR in a meaningful way.

But if you are using real log archiving, then you don't need to use a LVM
snapshot. Half of the point of the dance with pg_start_backup(), the
backup_label file, and the pg_stop_backup() is to protect you from problems
that are caused by the backup not being instantaneous.

> As far as I understood, this only works with a base backup created
> using pg_start/stop_backup. Is this right?

Or with the pg_basebackup program.

It is probably *possible* to do a PITR starting from a LVM snapshot rather
than a base backup, but that is just juggling running chainsaws.

So first decide if you want to use log archiving or not, then backup using
the method corresponding to that decision. Of course if you do use log
archiving, you need to make sure the archive is backed up, too.

Cheers,

Jeff

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Tomas Vondra 2014-01-10 19:40:51 Re: Question about memory usage
Previous Message Anand Kumar, Karthik 2014-01-10 18:45:06 Re: Sudden slow down and spike in system CPU causes max_connections to get exhausted