Skip site navigation (1) Skip section navigation (2)

Re: BUG #3110: Online Backup introduces Duplicate OIDs

From: Randy Isbell <jisbell(at)cisco(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #3110: Online Backup introduces Duplicate OIDs
Date: 2007-03-12 18:46:35
Message-ID: 90F94732-716D-4D27-B219-073EEE81CE13@cisco.com (view raw or flat)
Thread:
Lists: pgsql-bugs
On Mar 10, 2007, at 5:02 PM, Tom Lane wrote:

> "Randy Isbell \(jisbell\)" <jisbell(at)cisco(dot)com> writes:
>> I found that if I take an offline backup created around the same  
>> time as
>> my online backup, roll forward the transaction log files included  
>> in the
>> offline backup using a recovery.conf file, the duplicate records  
>> do NOT
>> exist.
>> Therefore it seems there is no corruption in the WAL files.  The  
>> problem
>> must be in the PITR processing of the online backup file.
>
> ... or there's something wrong with your backup procedure.
>
> I hadn't looked closely at that point before, but I see you  
> describe it
> as
>
>> 3. Issue pg_start_backup()
>> 4. Save off the data cluster
>> 5. Issue pg_stop_backup()
>> 6. Collect the WAL files
>> 7. Create a big hairy tar file with the stuff from items 4 and 6.
>> 8. Take the big hairy tar file to another server running the same pg
>> 8.2.3, untar and start postgres
>
> AFAICS this procedure is *not* suggested anywhere in our  
> documentation.

My description is abbreviated, but the process follows exactly that  
which is documented in Chapter 23.3 "Continuous Archiving and Point- 
In-Time Recovery".  I use the term "online backup" to refer to this  
process.

> What's bothering me about it is that I don't see anything guaranteeing
> that you have a full set of WAL files back to pg_start_backup().  If
> checkpoints occur during step 4, as is virtually certain given you say
> step 4 takes 20 minutes, then WAL files you need will get
> renamed/recycled.  What are you doing to "collect the WAL files"
> exactly?
>

"archive_command" is defined in postgresql.conf in order to save the  
WAL files to an archive location.  The start and end WAL file names  
are obtained from the backup history file.  All WAL files between the  
first and last inclusive are saved with the data cluster snapshot.   
In any event, starting postgres with this saved cluster initiates the  
recovery process and postgres does come up ready.

One thing I do find interesting:  while the backup created with my  
"online" / Chap 23.3 processing contains 18 WAL files, only the last  
3 (most recent) are used during the recovery process.  Is that normal?

> Also, what do you consider to be an "offline backup", and what's
> different in your process for creating that?

My term "offline backup" refers to the backup process documented in  
Chapter 23.2 "File System Level Backup".  In this backup postgres is  
shutdown and a snapshot of the data cluster is taken.

It seems odd that I'm the only person having this problem.  My  
database is not that big (10MM tuples in 100 tables or so).  So I  
question my processing or the environment.   After reviewing the  
process I find my backup procedure matches the documentation.  Could  
this be FreeBSD 6.1 or the filesystem?  How many folks are using  
FreeBSD 6.1 and UFS2?

- r.

In response to

Responses

pgsql-bugs by date

Next:From: Magnus HaganderDate: 2007-03-12 19:00:22
Subject: Re: BUG #3131: on win pg_connect() is very slow
Previous:From: Magnus HaganderDate: 2007-03-12 16:50:58
Subject: BUG #3147: Test bug report

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group