Re: Backup

From: "Andy Shellam (Mailing Lists)" <andy(dot)shellam-lists(at)mailnetwork(dot)co(dot)uk>
To: pgsql-admin(at)postgresql(dot)org
Subject: Re: Backup
Date: 2006-12-13 11:21:59
Message-ID: 457FE257.8000405@mailnetwork.co.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

This could be implemented as a fairly simple script that does:

- Specify a base backup name (e.g. with date/time)
- Connect to postgres database
- Issue pg_start_backup('base backup name')
- Tar the data cluster directory, excluding pg_xlog
- Issue pg_stop_backup
- Compress the created archive
- Move the created archive to the backup history folder (or tape-drive etc)
- Find the date/time stamp of the latest ".backup" file in the xlog archive
- Remove any WAL files created earlier than this ".backup" file
(excluding the first WAL file needed - specified as START_WAL_LOCATION
(within the .backup file)

This process runs on our main PGSQL server at 2:00 in the morning and
finishes within 4 minutes (on a 2GB database), and our backups can
restore reliably - we have a base backup from 2:00am, and WAL files
maintained throughout the day. We're getting a new WAL file archived
roughly every 3-6 minutes, so should our server crash (or we feel the
need) we can restore up to 6 minutes ago.

The archive_command is just a simple copy - "cp %p /path/to/archive/%f"

There's really nothing difficult, it puts you (the administrator) in
control of every aspect of your backup which is a good thing.

"If this is it, then I'll end up with an old level zero (i.e. full, base )
backup and A LOT of level 1 (i.e. transaction log) backup. I think it should
be more like it's for Informix, where you ask the database to perform an
online level 0 backup (base); after this, it stores transaction logs on disk,
which you can archive with level 1 backup. Then, say everyonce in a week, you
get another level 0 backup, and the database clears the already-archived logs
from disk and starts all over."

This to me is exactly the same approach. If you really want to, you can
tell PostgreSQL to do an online backup, wait a week (saving all WAL
logs), then perform another online backup and remove the previous week's
log files - it's just you'll end up with a lot of log files (dependent
on transaction frequency and archive settings.) You don't have to have
one base backup and then a ton of log files, because it'll take you
forever to restore it, in fact the more often you perform a base backup
the better (I had to restore a 2GB database a while backup, with a
week's worth of WAL files, and it took 10 hours!)

"a lack of -i in cp, for example"

FWIW, -i in copy won't render a backup useless - only if the file to be
archived has already been archived (in which case it stands a chance
it's the same file anyway as PG rotates/recycles WAL file numbers.)

Hope this goes some way to helping you out,

Andy.

Eduardo J. Ortega wrote:
> Hi:
>
> Well, I don't really like the fact that admin has to specify the archiving and
> restoring command; an error here (a lack of -i in cp, for example) may render
> the backup useless. In addition, the backup is performed only everytime the
> WAL file is filled; i need to take consistent backups every hour or so, and I
> am not sure if that time represents more or less than a WAL file ( i could
> still measure that, i guess). Finally, as I understand, the WAL backup method
> works like this:
> 1) Take full base FS backup
> 2) get some way to copy WAL files
>
> If this is it, then I'll end up with an old level zero (i.e. full, base )
> backup and A LOT of level 1 (i.e. transaction log) backup. I think it should
> be more like it's for Informix, where you ask the database to perform an
> online level 0 backup (base); after this, it stores transaction logs on disk,
> which you can archive with level 1 backup. Then, say everyonce in a week, you
> get another level 0 backup, and the database clears the already-archived logs
> from disk and starts all over.
>
> I guess this could be achieved with PG, but it requieres considerably more
> steps (pg_start_backup, pg_stop_backup, manually cleaning old log files which
> could be an error point), or I am getting something wrong. Besides, why do
> you need to tell the database to stop the backup? shouldn't it stop by itself
> when there's no more information to be archived?
>
> Perhaps if any of you has this method working, you could provide me with your
> archiving and restoring commands as well as a description of the whole
> process the way you implemented it.
>
> Thanks,
>
> Eduardo.
>
> On Wednesday 13 December 2006 02:39, Shoaib Mir wrote:
>
>> To me PITR looks like a very nice solution for incremental backup and even
>> they can serve as a warm standby. What exactly are the complications you
>> see with WAL archiving?
>>
>> ---------------
>> Shoaib Mir
>> EnterpriseDB (www.enterprisedb.com)
>>
>> On 12/13/06, Eduardo J. Ortega <ejortegau(at)cable(dot)net(dot)co> wrote:
>>
>>> Hi there:
>>>
>>> Are there any nice (official or third party) backup utilities for
>>> postgres? I
>>> have a database which is several GB, so pg_dumping it to file and then
>>> bzipping2 every hour is not really the way to go. I've read a little
>>> about using WAL for incremental backup, but it sounds a little
>>> complicated and *very* user-error prone.
>>>
>>> (Not sure if this is the right place for it, but i really think that PG's
>>> developers should consider including a viable backup utility in the next
>>> version)
>>>
>>> thanks.
>>>
>>> --
>>> Eduardo J. Ortega - Linux user #222873
>>> "No fake - I'm a big fan of konqueror, and I use it for everything." --
>>> Linus
>>> Torvalds
>>>
>>> ---------------------------(end of broadcast)---------------------------
>>> TIP 2: Don't 'kill -9' the postmaster
>>>
>
>

--
Andy Shellam
NetServe Support Team

the Mail Network
"an alternative in a standardised world"

p: +44 (0) 121 288 0832/0839
m: +44 (0) 7818 000834

In response to

  • Re: Backup at 2006-12-13 10:51:08 from Eduardo J. Ortega

Responses

  • Re: Backup at 2006-12-14 10:28:18 from Eduardo J. Ortega

Browse pgsql-admin by date

  From Date Subject
Next Message Shoaib Mir 2006-12-13 11:22:41 Re: Backup
Previous Message Eduardo J. Ortega 2006-12-13 10:51:08 Re: Backup