Failed archive_command copy - number of attempts configurable?

From: "dan(dot)m(dot)harris" <daniel(dot)harris(at)metaswitch(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Failed archive_command copy - number of attempts configurable?
Date: 2010-11-08 19:01:49
Message-ID: 1289242909183-3255563.post@n5.nabble.com
Lists: pgsql-general


Hi all,

I'm doing some testing of Postgres 9.0 archiving and streaming replication
between a couple of Solaris 10 servers. Recently I was trying to test how
well the standby server catches up after an outage, and a question arose.

It seems that if the standby is unreachable from the primary when the
primary attempts WAL archiving, the primary tries the copy three times,
then logs that the file could not be archived because there were too many
failures. See:

ssh: connect to host 172.18.131.212 port 22: Connection timed out
lost connection
LOG: archive command failed with exit code 1
DETAIL: The failed archive command was: scp pg_xlog/000000010000000000000006 postgres@172.18.131.212:/postgres/postgres/9.0-pgdg/primary_archive
ssh: connect to host 172.18.131.212 port 22: Connection timed out
lost connection
LOG: archive command failed with exit code 1
DETAIL: The failed archive command was: scp pg_xlog/000000010000000000000006 postgres@172.18.131.212:/postgres/postgres/9.0-pgdg/primary_archive
ssh: connect to host 172.18.131.212 port 22: Connection timed out
lost connection
LOG: archive command failed with exit code 1
DETAIL: The failed archive command was: scp pg_xlog/000000010000000000000006 postgres@172.18.131.212:/postgres/postgres/9.0-pgdg/primary_archive
WARNING: transaction log file "000000010000000000000006" could not be archived: too many failures

But then the primary repeats this whole cycle another 49 times! That is 50
cycles of three attempts each, so 150 attempts in all.
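
The attempt count can be checked against the server log rather than by eye. Below is a minimal sketch, assuming log lines start with the bare "LOG:" and "WARNING:" prefixes shown in the excerpt (a different log_line_prefix would need a different pattern). The function name and the synthetic sample log are made up for illustration.

```python
import re

def count_archive_failures(log_text):
    """Return (failed_attempts, gave_up_warnings) found in a server log."""
    # Each individual failed run of the archive command logs one LOG line.
    failed = len(re.findall(r"^LOG:\s+archive command failed",
                            log_text, re.MULTILINE))
    # Each abandoned cycle logs one WARNING line naming the WAL file.
    warnings = len(re.findall(r'^WARNING:\s+transaction log file "[0-9A-F]+" could not be',
                              log_text, re.MULTILINE))
    return failed, warnings

# Synthetic log: one cycle as in the excerpt (three failures, one warning),
# repeated 50 times to mirror the behaviour described in this message.
one_cycle = (
    "LOG: archive command failed with exit code 1\n" * 3
    + 'WARNING: transaction log file "000000010000000000000006" could not be\n'
    + "archived: too many failures\n"
)

failed, warnings = count_archive_failures(one_cycle * 50)
print(failed, warnings)  # 150 50
```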

What I need to know is whether these numbers are configurable. Can the
timing of the retries be set? How long before the primary stops retrying
altogether?
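
Whatever the server's built-in retry behaviour turns out to be, one way to get an attempt count and delay under my own control would be to wrap the copy in a script. This is only a rough sketch: the function names, parameters and defaults are hypothetical, and the server would presumably still re-run the wrapper on its own schedule whenever it exits non-zero.

```python
import subprocess
import sys
import time

def run_with_retries(cmd, attempts=3, delay_seconds=1.0):
    """Run cmd up to `attempts` times; return (succeeded, tries_used)."""
    for attempt in range(1, attempts + 1):
        if subprocess.call(cmd) == 0:
            return True, attempt
        print(f"attempt {attempt} of {attempts} failed", file=sys.stderr)
        if attempt < attempts:
            # Wait before the next try; no point sleeping after the last one.
            time.sleep(delay_seconds)
    return False, attempts

def archive_segment(wal_path, destination, attempts=3, delay_seconds=1.0):
    """Copy one WAL segment with scp; return an exit status (0 = archived)."""
    ok, _ = run_with_retries(["scp", wal_path, destination],
                             attempts, delay_seconds)
    return 0 if ok else 1
```

A small script calling archive_segment(sys.argv[1], ...) and exiting with its return value could then be named in archive_command, with %p passed as the first argument, in place of the bare scp call.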

Any help appreciated. Thanks!
Dan