
Re: [GENERAL] 8.1.4 - problem with PITR - .backup.done /

From: Rafael Martinez <r(dot)m(dot)guerrero(at)usit(dot)uio(dot)no>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [GENERAL] 8.1.4 - problem with PITR - .backup.done /
Date: 2006-05-30 21:01:32
Message-ID: 1149022892.980.60.camel@linux.site
Lists: pgsql-general, pgsql-hackers

On Tue, 2006-05-30 at 15:38 -0400, Tom Lane wrote:
[.......]
> 
> My thought is that the stat()s on the .done file failed for some obscure
> reason, perhaps insufficient kernel resources, even though the file was 
> actually there.
> 
> If you have postmaster log output for the interval in which this
> happened, it would be interesting to look for occurrences of this
> warning message from pgarch_archiveDone:
> 
>     if (rename(rlogready, rlogdone) < 0)
>         ereport(WARNING,
>                 (errcode_for_file_access(),
>                  errmsg("could not rename file \"%s\" to \"%s\": %m",
>                         rlogready, rlogdone)));
> 
> If you find any then we might need a different theory ...
> 

I do not find any warning message "could not rename file ...". These are
the relevant entries in the log file:

--------------------------------------------------------
[2006-05-29 17:31:55.212 CEST]   12022 LOG:  archived transaction log
file "00000001000000080000000F"

**** PITR_basebackup script started around 17:32 ****

[2006-05-29 17:40:27.735 CEST]   12022 LOG:  archived transaction log
file "000000010000000800000010"
[2006-05-29 17:49:32.075 CEST]   12022 LOG:  archived transaction log
file "000000010000000800000011"
[2006-05-29 17:59:40.575 CEST]   12022 LOG:  archived transaction log
file "000000010000000800000012"
[2006-05-29 18:08:27.229 CEST]   12022 LOG:  archived transaction log
file "000000010000000800000013"
[2006-05-29 18:11:36.434 CEST]   12022 LOG:  archived transaction log
file "000000010000000800000010.0006D5E8.backup"

[2006-05-29 18:11:36.467 CEST]   12022 LOG:  archive command
"archive_wal.sh -P pg_xlog/000000010000000800000010.0006D5E8.backup -F
000000010000000800000010.0006D5E8.backup" failed: return code 256

[2006-05-29 18:11:37.479 CEST]   12022 LOG:  archive command
"archive_wal.sh -P pg_xlog/000000010000000800000010.0006D5E8.backup -F
000000010000000800000010.0006D5E8.backup" failed: return code 256

[2006-05-29 18:11:38.492 CEST]   12022 LOG:  archive command
"archive_wal.sh -P pg_xlog/000000010000000800000010.0006D5E8.backup -F
000000010000000800000010.0006D5E8.backup" failed: return code 256

[2006-05-29 18:11:38.492 CEST]   12022 WARNING:  transaction log file
"000000010000000800000010.0006D5E8.backup" could not be archived: too
many failures

**** PITR_basebackup script finished at 18:12:16 ****
...............................
**** Same error several times until we deleted the .backup.ready file at
18:15 ****

[2006-05-29 18:19:14.546 CEST]   12022 LOG:  archived transaction log
file "000000010000000800000014"
[2006-05-29 18:30:10.939 CEST]   12022 LOG:  archived transaction log
file "000000010000000800000015"
...............................
--------------------------------------------------------
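
A note on "return code 256" in the entries above: assuming the archiver
logs the raw value returned by system(), and assuming the usual Linux
layout of a wait status, 256 corresponds to archive_wal.sh exiting with
status 1. A minimal Python sketch of that decoding (the function name is
made up for illustration):

```python
def decode_wait_status(status: int) -> dict:
    """Split a raw wait status, as returned by system(), into its parts.

    Assumes the conventional Linux encoding: the low 7 bits hold the
    terminating signal (0 if the process exited normally) and the next
    8 bits hold the exit code.
    """
    signal = status & 0x7F
    exit_code = (status >> 8) & 0xFF
    return {
        "exited_normally": signal == 0,
        "exit_code": exit_code if signal == 0 else None,
        "signal": signal or None,
    }


print(decode_wait_status(256))
# → {'exited_normally': True, 'exit_code': 1, 'signal': None}
```

So the three failures look like ordinary "exit 1" returns from the
archive script rather than a crash or a signal.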

Our PITR_basebackup script does this:

* Checks whether the backup label file exists
* Starts the backup process with pg_start_backup()
* Creates an LVM2 snapshot of the data partition
* Mounts the snapshot partition
* Creates a tar.bz2 file of the data
* Unmounts the snapshot partition
* Removes the snapshot LV
* Backs up the last WAL file not yet archived
* Stops the backup process with pg_stop_backup()
* Waits for the *.backup file to appear under the archivedir
* Removes old archived WAL files
* Removes the old tar.bz2 data file
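
The sequence above could be sketched roughly as follows. This is only an
illustration in Python, not the actual script: every function name, path,
volume name, and the 1G snapshot size are placeholders, nothing is
executed (the plan is just a list of commands), and the WAL copy and
cleanup steps are left out because their details are not described here.

```python
import glob
import os
import time


def backup_in_progress(pgdata):
    """True if a backup_label file already exists in the data directory."""
    return os.path.exists(os.path.join(pgdata, "backup_label"))


def plan_basebackup(vg, lv, snap, mountpoint, tarball, label):
    """Return the external commands for the snapshot steps, in order.

    Dry run only: the caller decides whether and how to execute them.
    All arguments are hypothetical placeholders.
    """
    origin = f"/dev/{vg}/{lv}"
    snapshot = f"/dev/{vg}/{snap}"
    return [
        ["psql", "-At", "-c", f"SELECT pg_start_backup('{label}')"],
        # 1G of copy-on-write space is an arbitrary illustrative size.
        ["lvcreate", "--snapshot", "--size", "1G", "--name", snap, origin],
        ["mount", snapshot, mountpoint],
        ["tar", "-cjf", tarball, "-C", mountpoint, "."],
        ["umount", mountpoint],
        ["lvremove", "-f", snapshot],
        # Copying the last unarchived WAL file would go here (omitted).
        ["psql", "-At", "-c", "SELECT pg_stop_backup()"],
    ]


def wait_for_backup_history(archive_dir, already_there, timeout=600.0, poll=1.0):
    """Poll archive_dir until a new *.backup file shows up.

    already_there is the set of *.backup paths seen before the backup
    started, so that files from older backups are ignored. Returns the
    new path, or None if the timeout expires first.
    """
    deadline = time.monotonic() + timeout
    pattern = os.path.join(archive_dir, "*.backup")
    while True:
        new = sorted(set(glob.glob(pattern)) - set(already_there))
        if new:
            return new[-1]
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll)


if __name__ == "__main__":
    for cmd in plan_basebackup("vg0", "pgdata", "pgsnap", "/mnt/pgsnap",
                               "/backup/base.tar.bz2", "nightly"):
        print(" ".join(cmd))
```

The timeout in the wait step matters in a case like the one in the log:
if archiving of the .backup file keeps failing, a wait without a timeout
would never return.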

Creating the tar.bz2 file and deleting old WAL files can take some
time. The total running time of the PITR_basebackup script was 2412 sec.

If we get the same problem again, I will try to get more information
from the system. As I said in my last e-mail, this has been a one-time
problem.

regards,
-- 
Rafael Martinez, <r(dot)m(dot)guerrero(at)usit(dot)uio(dot)no>
Center for Information Technology Services
University of Oslo, Norway

PGP Public Key: http://folk.uio.no/rafael/

