From: | Andy Colson <andy(at)squeakycode(dot)net> |
---|---|
To: | PostgreSQL general <pgsql-general(at)postgresql(dot)org> |
Subject: | 9.3 to 9.5 upgrade problems |
Date: | 2016-07-03 15:06:55 |
Message-ID: | f41c4762-362c-7cbb-96fd-a49685d18d50@squeakycode.net |
Lists: | pgsql-general |
Hi all,
I have a master (web1) and two slaves (web2, webserv). One slave is quite far from the master and the db is 112 GB, so pg_basebackup is my last resort.
I followed the page here:
https://www.postgresql.org/docs/9.5/static/pgupgrade.html
including the rsync stuff. I practiced it _twice_, once on PG 9.5 beta, and again a week ago, on two VMs I created locally. Both practice sessions worked perfectly.
I just ran it on the live databases. The master seems ok: it's running PG 9.5 now, I can log in to it, and there are no errors in the log.
Neither slave works. After I'd finished the pg_upgrade steps, both slaves gave me this error:
FATAL: database system identifier differs between the primary and standby
Sure enough, pg_controldata showed a different database system identifier on each machine (web1, web2 and webserv all differed, no matches at all), so I'm assuming the rsync didn't copy things correctly, or I missed a step and ran it too early, or something ... I'm not quite sure.
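The identifier check described above can be sketched in Python. This is only a rough illustration: it assumes pg_controldata is on the PATH of each host and that passwordless ssh works, and the helper names (system_identifier, read_controldata, identifiers_match) are made up for this example.

```python
import subprocess


def system_identifier(controldata_output):
    """Extract the 'Database system identifier' value from pg_controldata output."""
    for line in controldata_output.splitlines():
        # Split on the first colon only; some values contain colons themselves.
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Database system identifier":
            return value.strip()
    raise ValueError("no 'Database system identifier' line found")


def read_controldata(host, datadir):
    """Run pg_controldata on a remote host over ssh.

    Assumption: passwordless ssh to `host` and pg_controldata on its PATH.
    """
    result = subprocess.run(
        ["ssh", host, "pg_controldata", datadir],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def identifiers_match(outputs):
    """Return True if every host's pg_controldata output has the same identifier.

    `outputs` maps a host name to the text printed by pg_controldata there.
    """
    ids = {system_identifier(text) for text in outputs.values()}
    return len(ids) == 1
```

With the hosts from this setup, the call would look like identifiers_match({h: read_controldata(h, "/pub/pg95") for h in ("web1", "web2", "webserv")}); a standby can only follow the master if this returns True.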
I needed to get the live website back up and running again, so I let the master go live, ran analyze, and when that finished, used the steps here to try to resync:
https://wiki.postgresql.org/wiki/Binary_Replication_Tutorial
On the master:
select pg_start_backup('clone',true);
rsync -av --exclude pg_xlog --exclude postgresql.conf /pub/pg95/* web2:/pub/pg95/
select pg_stop_backup();
rsync -av /pub/pg95/pg_xlog web2:/pub/pg95/
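For repeating the same four steps against the second slave, a small sketch that only builds and prints the commands may help. It does not run anything. The function name resync_steps is hypothetical, and connecting with "psql -U postgres" is an assumption about how the SQL statements are issued.

```python
import shlex


def resync_steps(datadir, standby, label="clone"):
    """Return the four resync commands, in order, as shell command strings.

    Assumptions: SQL is issued through `psql -U postgres`, and the data
    directory lives at the same path on the master and on the standby.
    """
    src = datadir.rstrip("/")
    start_sql = "select pg_start_backup('{}',true);".format(label)
    stop_sql = "select pg_stop_backup();"
    return [
        # 1. Put the master into backup mode (true requests an immediate checkpoint).
        "psql -U postgres -c " + shlex.quote(start_sql),
        # 2. Copy the data directory, leaving out WAL and the standby's own config.
        "rsync -av --exclude pg_xlog --exclude postgresql.conf "
        "{0}/* {1}:{0}/".format(src, standby),
        # 3. Leave backup mode.
        "psql -U postgres -c " + shlex.quote(stop_sql),
        # 4. Copy the WAL written while the backup was running.
        "rsync -av {0}/pg_xlog {1}:{0}/".format(src, standby),
    ]


def print_steps(datadir, standby):
    """Print the commands for review instead of executing them."""
    for number, command in enumerate(resync_steps(datadir, standby), start=1):
        print("{}. {}".format(number, command))
```

Calling print_steps("/pub/pg95", "webserv") lists the commands for the far-away slave so they can be checked before being run by hand.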
That ran pretty quickly, and pg_controldata now shows matching identifiers, but when I start the slave I get:
,,2016-07-03 06:06:57.173 CDT,: LOG: entering standby mode
,,2016-07-03 06:06:57.205 CDT,: LOG: redo starts at 369/D6002228
,,2016-07-03 06:06:57.984 CDT,: LOG: consistent recovery state reached at 369/DCC5DB90
,,2016-07-03 06:06:57.984 CDT,: LOG: database system is ready to accept read only connections
,,2016-07-03 06:06:57.984 CDT,: LOG: invalid record length at 369/DD038ED0
,,2016-07-03 06:06:58.344 CDT,: LOG: started streaming WAL from primary at 369/DD000000 on timeline 1
web,[unknown],2016-07-03 06:07:11.176 CDT,[local]: FATAL: role "andy" does not exist
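The WAL locations in the log above are written as two hexadecimal numbers, the high and low 32 bits of a 64-bit byte position. A minimal sketch of that arithmetic, assuming the default 16 MB WAL segment size (the function names are made up for this example), shows how the streaming start point relates to the "invalid record length" position:

```python
# Assumption: default WAL segment size of 16 MB.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def parse_lsn(lsn):
    """Convert an LSN such as '369/DD038ED0' into a 64-bit byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def format_lsn(position):
    """Convert a byte position back into the 'high/low' hexadecimal form."""
    return "{:X}/{:X}".format(position >> 32, position & 0xFFFFFFFF)


def segment_start(lsn):
    """Return the LSN of the first byte of the WAL segment containing `lsn`."""
    position = parse_lsn(lsn)
    return format_lsn(position - position % WAL_SEGMENT_SIZE)


def bytes_between(start_lsn, end_lsn):
    """Number of WAL bytes between two LSNs."""
    return parse_lsn(end_lsn) - parse_lsn(start_lsn)
```

Under that assumption, segment_start("369/DD038ED0") gives 369/DD000000, the same location the log reports streaming started from, and bytes_between("369/D6002228", "369/DCC5DB90") gives the amount of WAL replayed before the consistent state was reached.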
I can log in as myself on the master, but not on the slave. When I run "psql -U postgres" on the slave I get:
psql: FATAL: cache lookup failed for database 16401
So far I have only tried this on web2. It's close to web1, so I'm hoping I can get it fixed and then rsync it quickly to the far-away slave.
I'm at a loss here, any hints or suggestions would be appreciated.
Thanks,
-Andy