Re: [PATCHES] Doc update for pg_start_backup

From: Theo Schlossnagle <jesus(at)omniti(dot)com>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: Theo Schlossnagle <jesus(at)omniti(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgreSQL(dot)org, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: [PATCHES] Doc update for pg_start_backup
Date: 2007-06-29 12:33:43
Message-ID: AA3E4295-3D37-4B2D-8B79-42538D9F532A@omniti.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches


On Jun 29, 2007, at 4:25 AM, Heikki Linnakangas wrote:

> Tom Lane wrote:
>> Heikki Linnakangas <heikki(at)enterprisedb(dot)com> writes:
>>> Added a note to the docs that pg_start_backup can take a long
>>> time to finish now that we spread out checkpoints:
>> I was starting to wordsmith this, and then wondered whether it's not
>> just a stupid idea for pg_start_backup to act that way. The reason
>> you're doing it is to take a base backup, right? What are you going
>> to take the base backup with? I do not offhand know of any backup
>> tools that don't suck major amounts of I/O bandwidth.
>
> scp over a network? It's still going to consume a fair amount of I/
> O, but the network could very well be the bottleneck.
>
>> That being
>> the case, you're simply not going to schedule the operation during
>> full-load periods. And that leads to the conclusion that
>> pg_start_backup should just use CHECKPOINT_IMMEDIATE and not slow
>> you down.
>
> That's probably true in most cases. But on a system that doesn't
> have quite periods, you're still going to have to take the backup.
> To be honest, I've never worked as a DBA and never had to deal with
> taking backups of a production system, so my gut feelings on this
> could be totally wrong.

I'll share my two cents having had to back up many terabytes of
oracle, postgres and mysql every day...

The comments that taking a backup causes a lot of absolutely
unavoidable I/O is right on the mark.

If you have a large enough database where this matters the technique
usually looks as follows.

(1) sanity
(2) postgres_start_backup
(3) snap
(4) postgres_stop_backup
(5) backup

Now, the backup will always have to read the data, if it is full it
reads every block. If it is incremental, it reads the blocks that
changed. You will frequently be in the position of performing a full
backup. The bandwidth for doing the read will inevitably happen in
one or more of the above steps. I strongly prefer that load to
happen in (5) and for steps (2,3,4) to happen as quickly as
possible. Right now on our largest (slowest) production box which is
postgres and over a terabyte, steps 2-4 take about 30-60 seconds.
Step 5 takes *cough* about 18 hours *cough*.

The snap in many of our cases is an logical software enabled snapshot
(either Veritas, LVM or ZFS). However, you can use many enterprise
storage to take a hard snapshot and expose that as a LUN to mount
elsewhere on attached to the same SAN. Many confuse this for being
"free". Regardless of how the snap is taken you have to pay for it..
either at snap time, at read time or at release time. Nothing's free.

// Theo Schlossnagle
// Principal(at)OmniTI: http://omniti.com
// Esoteric Curio: http://www.lethargy.org/~jesus/

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Jim Nasby 2007-06-29 13:07:34 Re: Bgwriter LRU cleaning: we've been going at this all wrong
Previous Message Simon Riggs 2007-06-29 12:10:38 Configurable Additional Stats

Browse pgsql-patches by date

  From Date Subject
Next Message Jim Nasby 2007-06-29 13:28:47 Re: [PATCHES] Doc update for pg_start_backup
Previous Message Patrick Welche 2007-06-29 12:11:27 Re: configure.in / xml / quoting trouble