Re: [pg_rewind] cp: cannot stat ‘pg_wal/RECOVERYHISTORY’: No such file or directory

From: Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Cc: tushar <tushar(dot)ahuja(at)enterprisedb(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>
Subject: Re: [pg_rewind] cp: cannot stat ‘pg_wal/RECOVERYHISTORY’: No such file or directory
Date: 2019-06-10 11:07:52
Message-ID: CAGz5QC+Bhp0CF42F9_D6V2VQCizhZZrPH1uCLaZoHs5q9sURjw@mail.gmail.com
Lists: pgsql-hackers

Hello,

On Wed, Jun 5, 2019 at 11:55 AM tushar <tushar(dot)ahuja(at)enterprisedb(dot)com> wrote:
I can see two different problems in this setup.

> > 2)Slave Setup -> ./pg_basebackup -PR -X stream -c fast -h 127.0.0.1
> > -U centos -p 5432 -D slave
> > restore_command='cp %p /tmp/archive_dir1/%f'
> > "
> > 7)Modify old master/postgresql.conf file -
> > restore_command='cp %p /tmp/archive_dir1/%f'
When we define a restore command, we tell the server how to copy a WAL
file from the archive: %f is the name of the file to fetch and %p is
the path to copy it to. So, it should be
restore_command='cp /tmp/archive_dir1/%f %p'

This is why you're getting the following errors:
> > cp: cannot stat ‘pg_wal/RECOVERYHISTORY’: No such file or directory
> > cp: cannot stat ‘pg_wal/RECOVERYXLOG’: No such file or directory
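
To see why the swapped arguments produce exactly this error, here is a
small sketch that mimics how the server expands %f and %p before
running the command. The expand helper, the temporary directories and
the file name 00000002.history are made up for illustration; only the
meaning of the two placeholders is taken from the setup above.

```shell
#!/bin/sh
# Illustration only: mimic the server's %f / %p substitution with sed.
# expand() is a hypothetical helper, not part of PostgreSQL.
expand() {
    # $1 = restore_command template, $2 = value for %f, $3 = value for %p
    printf '%s\n' "$1" | sed -e "s|%f|$2|g" -e "s|%p|$3|g"
}

archive=$(mktemp -d)      # stands in for /tmp/archive_dir1
datadir=$(mktemp -d)      # stands in for the server's data directory
mkdir "$datadir/pg_wal"
echo "timeline history" > "$archive/00000002.history"

wrong="cp %p $archive/%f"     # order used in the report
right="cp $archive/%f %p"     # archive first, destination second

cd "$datadir" || exit 1

# Wrong order: cp reads from pg_wal/RECOVERYHISTORY, which does not exist yet.
wrong_cmd=$(expand "$wrong" 00000002.history pg_wal/RECOVERYHISTORY)
if sh -c "$wrong_cmd" 2>/dev/null; then wrong_status=ok; else wrong_status=failed; fi

# Right order: cp reads from the archive and writes pg_wal/RECOVERYHISTORY.
right_cmd=$(expand "$right" 00000002.history pg_wal/RECOVERYHISTORY)
if sh -c "$right_cmd"; then right_status=ok; else right_status=failed; fi

echo "wrong: $wrong_status"
echo "right: $right_status"
```

With the reported order, cp is asked to read the not-yet-existing
pg_wal/RECOVERYHISTORY, so it fails the same way as in the log.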

> > 2019-05-27 18:55:47.424 IST [25513] FATAL: the database system is
> > starting up
> > 2019-05-27 18:55:47.425 IST [25512] FATAL: could not connect to the
> > primary server: FATAL: the database system is starting up
This case looks interesting.

1. Master is running on port 5432.
2. A standby is created using pg_basebackup with the -R option. So,
pg_basebackup appends the primary connection settings to
postgresql.auto.conf so that streaming replication can use the
same settings later on.
cat postgresql.auto.conf -> primary_conninfo = 'port=5432'
3. The standby is started on port 5433.
4. Standby is promoted and old master is stopped.
5. Using pg_rewind, the old master is synchronized with the promoted
standby. As part of the process, pg_rewind copies the
postgresql.auto.conf of the promoted standby to the old master.
6. Now, the old master is configured as a standby but the
postgresql.auto.conf still contains the following settings:
cat postgresql.auto.conf -> primary_conninfo = 'port=5432'
So, the old master tries to connect to the server on port 5432 and
finds itself, which is still in recovery.

This can surely be fixed from the script. While configuring the old
master as a standby server, clear/modify the settings in
postgresql.auto.conf. But this contradicts the comment in the file,
which forbids the user from editing it.
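
As a rough sketch of the script-side fix, the script could rewrite
primary_conninfo after pg_rewind and before starting the old master.
The file below is a temporary stand-in rather than a real data
directory, and port 5433 is the promoted standby's port from this
setup. Editing the file directly is exactly what its header comment
warns against, so treat this as an illustration, not a recommendation.

```shell
#!/bin/sh
# Illustration only: operate on a throwaway copy, not a real data directory.
conf=$(mktemp)
cat > "$conf" <<'EOF'
# Do not edit this file manually!
# It will be overwritten by the ALTER SYSTEM command.
primary_conninfo = 'port=5432'
EOF

new_port=5433   # port of the promoted standby in this setup

# Rewrite the primary_conninfo line. Write to a second file first so a
# failed sed does not leave a truncated config behind.
sed "s|^primary_conninfo *=.*|primary_conninfo = 'port=$new_port'|" "$conf" > "$conf.new" &&
    mv "$conf.new" "$conf"

grep '^primary_conninfo' "$conf"
```

On a running server ALTER SYSTEM would be the supported way to change
the setting, but the old master is not running at this step.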

Any thoughts?
--
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com
