Log archiving failing. Seems to be wrong timeline

From: Chris Lewis <clewis(at)inview(dot)co(dot)uk>
To: <pgsql-general(at)postgresql(dot)org>
Subject: Log archiving failing. Seems to be wrong timeline
Date: 2016-06-30 10:53:24
Message-ID: 5774FA24.6060600@inview.co.uk
Lists: pgsql-general

Hello,

We have two PostgreSQL servers (v9.4.2), a master and a slave, in
streaming replication. The cluster is controlled by Pacemaker and
Corosync, with the pgsql cluster agent handling failover to, and
promotion of, the slave.

Recently a failover occurred, and I noticed that log archiving was
failing on the master:

cp: cannot stat 'pg_xlog/000000020000000000000002': No such file or directory
2016-06-30 11:49:48 BST [13816]: [1235-1] db=,user=,client= LOG: archive command failed with exit code 1
2016-06-30 11:49:48 BST [13816]: [1236-1] db=,user=,client= DETAIL: The failed archive command was: cp pg_xlog/000000020000000000000002 /mnt/pgsql/data/pg_archive/000000020000000000000002
cp: cannot stat 'pg_xlog/000000020000000000000002': No such file or directory
2016-06-30 11:49:49 BST [13816]: [1237-1] db=,user=,client= LOG: archive command failed with exit code 1
2016-06-30 11:49:49 BST [13816]: [1238-1] db=,user=,client= DETAIL: The failed archive command was: cp pg_xlog/000000020000000000000002 /mnt/pgsql/data/pg_archive/000000020000000000000002
2016-06-30 11:49:49 BST [13816]: [1239-1] db=,user=,client= WARNING: archiving transaction log file "000000020000000000000002" failed too many times, will try again later

But that segment name is on timeline 2, and the server is now on
timeline 44 (0x2C):

# /usr/lib/postgresql/9.4/bin/pg_controldata /mnt/pgsql/data
pg_control version number: 942
Catalog version number: 201409291
Database system identifier: 6198394727571912088
Database cluster state: in production
pg_control last modified: Thu 30 Jun 2016 11:42:42 BST
Latest checkpoint location: 2/EEE842E8
Prior checkpoint location: 2/EED64F68
Latest checkpoint's REDO location: 2/EEE4B610
Latest checkpoint's REDO WAL file: 0000002C00000002000000EE
Latest checkpoint's TimeLineID: 44
Latest checkpoint's PrevTimeLineID: 44
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID: 0/2947680
Latest checkpoint's NextOID: 74375
Latest checkpoint's NextMultiXactId: 464
Latest checkpoint's NextMultiOffset: 929
Latest checkpoint's oldestXID: 677
Latest checkpoint's oldestXID's DB: 1
Latest checkpoint's oldestActiveXID: 2947680
Latest checkpoint's oldestMultiXid: 1
Latest checkpoint's oldestMulti's DB: 1
Time of latest checkpoint: Thu 30 Jun 2016 11:42:27 BST
Fake LSN counter for unlogged rels: 0/1
Minimum recovery ending location: 0/0
Min recovery ending loc's timeline: 0
Backup start location: 0/0
Backup end location: 0/0
End-of-backup record required: no
Current wal_level setting: hot_standby
Current wal_log_hints setting: off
Current max_connections setting: 250
Current max_worker_processes setting: 8
Current max_prepared_xacts setting: 10
Current max_locks_per_xact setting: 64
Maximum data alignment: 8
Database block size: 8192
Blocks per segment of large relation: 131072
WAL block size: 8192
Bytes per WAL segment: 16777216
Maximum length of identifiers: 64
Maximum columns in an index: 32
Maximum size of a TOAST chunk: 1996
Size of a large-object chunk: 2048
Date/time type storage: 64-bit integers
Float4 argument passing: by value
Float8 argument passing: by value
Data page checksum version: 0
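
For reference, the timeline can be read straight off the segment name.
A minimal sketch in Python, assuming the standard 24-hex-digit WAL
segment naming (8 digits each for timeline, log and segment);
parse_wal_segment is only an illustrative helper of mine, not part of
PostgreSQL:

```python
import re

# A WAL segment file name is 24 upper-case hex digits:
# 8 for the timeline ID, 8 for the log number, 8 for the segment number.
WAL_NAME = re.compile(r"^[0-9A-F]{24}$")


def parse_wal_segment(name):
    """Split a WAL segment file name into (timeline, log, segment)."""
    if not WAL_NAME.match(name):
        raise ValueError(f"not a WAL segment file name: {name!r}")
    return int(name[0:8], 16), int(name[8:16], 16), int(name[16:24], 16)


# The segment the archiver keeps failing on, from the log output.
failing = parse_wal_segment("000000020000000000000002")
# The "Latest checkpoint's REDO WAL file" reported by pg_controldata.
current = parse_wal_segment("0000002C00000002000000EE")

print(failing[0], current[0])  # 2 44
```

So the file the archiver wants is on timeline 2, while the checkpoint
is on timeline 44, which agrees with the TimeLineID shown above.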

Why are we trying to archive logs which belong to an old timeline?

Any thoughts much appreciated.

Regards

Chris
