Re: Hot Backup with rsync fails at pg_clog if under load

From: Linas Virbalas <linas(dot)virbalas(at)continuent(dot)com>
To: Euler Taveira de Oliveira <euler(at)timbira(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Florian Pflug <fgp(at)phlo(dot)org>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hot Backup with rsync fails at pg_clog if under load
Date: 2011-09-23 08:02:19
Message-ID: CAA217BB.1E2F3%linas.virbalas@continuent.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 9/22/11 6:59 PM, "Euler Taveira de Oliveira" <euler(at)timbira(dot)com> wrote:

>> If needed, I could do that, if I had the exact procedure... Currently,
>> during the start of the backup I take the following information:
>>
> Just show us the output of pg_start_backup and part of the standby log with
> the following message 'redo starts at' and the subsequent messages up to the
> failure.

Unfortunately, it's impossible, because the error message "Could not read
from file "pg_clog/0001" at offset 32768: Success" is shown (and startup
aborted) before the turn for "redo starts at" message arrives.

For comparison purposes, here is the log from the original message:

2011-09-21 13:41:05 CEST LOG: database system was interrupted; last known
up at 2011-09-21 13:40:50 CEST
Restoring 00000014.history
mv: cannot stat `/opt/PostgreSQL/9.1/archive/00000014.history': No such file
or directory
Restoring 00000013.history
2011-09-21 13:41:05 CEST LOG: restored log file "00000013.history" from
archive
! 2011-09-21 13:41:05 CEST LOG: entering standby mode
Restoring 0000001300000006000000DC
2011-09-21 13:41:05 CEST LOG: restored log file "0000001300000006000000DC"
from archive
Restoring 0000001300000006000000DB
2011-09-21 13:41:05 CEST LOG: restored log file "0000001300000006000000DB"
from archive
2011-09-21 13:41:05 CEST FATAL: could not access status of transaction
1188673
2011-09-21 13:41:05 CEST DETAIL: Could not read from file "pg_clog/0001" at
offset 32768: Success.
2011-09-21 13:41:05 CEST LOG: startup process (PID 13819) exited with exit
code 1
2011-09-21 13:41:05 CEST LOG: aborting startup due to startup process
Failure

As you can see, there is no "redo starts at" message. If we compare this to
a healthy standby startup, we can see that the "pg_clog" error appears after
the "entering standby mode", but before the "redo starts at", hence the
latter is not reached. Healthy log example:

2011-09-23 09:52:31 CEST LOG: database system was interrupted; last known
up at 2011-09-23 09:52:15 CEST
Restoring 00000007.history
mv: cannot stat `/opt/PostgreSQL/9.1/archive/00000007.history': No such file
or directory
Restoring 00000006.history
2011-09-23 09:52:31 CEST LOG: restored log file "00000006.history" from
archive
2011-09-23 09:52:31 CEST LOG: entering standby mode
Restoring 000000060000000000000065
2011-09-23 09:52:31 CEST LOG: restored log file "000000060000000000000065"
from archive
2011-09-23 09:52:31 CEST LOG: redo starts at 0/65000020
2011-09-23 09:52:31 CEST LOG: consistent recovery state reached at
0/66000000
2011-09-23 09:52:31 CEST LOG: database system is ready to accept read only
connections
Restoring 000000060000000000000066
mv: cannot stat `/opt/PostgreSQL/9.1/archive/000000060000000000000066': No
such file or directory
Restoring 00000007.history
mv: cannot stat `/opt/PostgreSQL/9.1/archive/00000007.history': No such file
or directory
2011-09-23 09:52:31 CEST LOG: streaming replication successfully connected
to primary

--
Best regards,
Linas Virbalas
http://flyingclusters.blogspot.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2011-09-23 08:41:28 Re: Hot Backup with rsync fails at pg_clog if under load
Previous Message Jaime Casanova 2011-09-23 05:05:21 Re: [BUGS] BUG #5206: wal_sync_method in stock postgresql.conf may be wrong