BUG #14119: Database restore aborts with permission denied message

From: Jeffrey(dot)Marshall(at)usitc(dot)gov
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #14119: Database restore aborts with permission denied message
Date: 2016-04-28 13:28:28
Message-ID: 20160428132828.19110.9264@wrigleys.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 14119
Logged by: Jeff Marshall
Email address: Jeffrey(dot)Marshall(at)usitc(dot)gov
PostgreSQL version: 9.2.16
Operating system: RHEL 6.7
Description:

DESCRIPTION: When performing a database restore using the 9.2.16
executables, the restore aborts with a permission denied message. The error
occurred restoring both backups that were made under 9.2.16 as well as
backups made under prior versions (9.2.15 at least).

ENVIRONMENT:

PostgreSQL 9.2.16

[postgres(at)postsand01 pg_xlog]$ psql --version
psql (PostgreSQL) 9.2.16
[postgres(at)postsand01 pg_xlog]$ psql
psql (9.2.16)
Type "help" for help.

admin_db=# select version();
version
----------------------------------------------------------------------------------------------------------------
PostgreSQL 9.2.16 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.7
20120313 (Red Hat 4.4.7-16), 64-bit
(1 row)

Operating System Information:

[postgres(at)postsandXX pg_xlog]$ uname -a
Linux postXX.usitc.gov 2.6.32-573.22.1.el6.x86_64 #1 SMP Thu Mar 17 03:23:39
EDT 2016 x86_64 x86_64 x86_64 GNU/Linux

The Postgres executables were downloaded/installed and not compiled
locally.

STEPS TAKEN TO PRODUCE ERROR:

The restores being performed are not to a specific point in time (i.e.
restore the database as of 4/18/16 09:59:17); the restores are just loading
the database as of the completion of the last full backup.

Perform a database backup:

select pg_start_backup('backup_2016_04_19');

tar and compress the tablespace directories (2 tablespace directories in
this case) and the cluster directory
save any WAL files created during the backup

select pg_stop_backup();

Restore the database:

Remove tablespace directories
Remove cluster directory
Uncompress/extract the tablespace directories
Uncompress/extract the cluster directory
Remove files from pg_xlog directory
Create a recovery.conf file that points to the directory with the WAL
archive files created during the backup

restart the server using the 9.2.16 executables

pg_ctl -D /pgdata/pgsrv_cluster_data/postXX_rest_92 start

The restore process aborts with the following messages in the database
server error log:

2016-04-27 17:02:05 EDT 572128cd.1811 [1-1] user=,db=,remote= LOG: 00000:
database system was interrupted; last known up at

2016-04-19 14:18:03 EDT
2016-04-27 17:02:05 EDT 572128cd.1811 [2-1] user=,db=,remote= LOCATION:
StartupXLOG, xlog.c:6372
2016-04-27 17:02:05 EDT 572128cd.1811 [3-1] user=,db=,remote= LOG: 00000:
starting archive recovery
2016-04-27 17:02:05 EDT 572128cd.1811 [4-1] user=,db=,remote= LOCATION:
StartupXLOG, xlog.c:6465
2016-04-27 17:02:06 EDT 572128cd.1811 [5-1] user=,db=,remote= LOG: 00000:
restored log file "0000000100000006000000FE" from

archive
2016-04-27 17:02:06 EDT 572128cd.1811 [6-1] user=,db=,remote= LOCATION:
RestoreArchivedFile, xlog.c:3267
2016-04-27 17:02:06 EDT 572128cd.1811 [7-1] user=,db=,remote= FATAL: 42501:
could not open file "pg_xlog/RECOVERYXLOG":

Permission denied
2016-04-27 17:02:06 EDT 572128cd.1811 [8-1] user=,db=,remote= LOCATION:
fsync_fname_ext, fd.c:2654
2016-04-27 17:02:06 EDT 572128cc.180f [3-1] user=,db=,remote= LOG: 00000:
startup process (PID 6161) exited with exit code 1
2016-04-27 17:02:06 EDT 572128cc.180f [4-1] user=,db=,remote= LOCATION:
LogChildExit, postmaster.c:3008
2016-04-27 17:02:06 EDT 572128cc.180f [5-1] user=,db=,remote= LOG: 00000:
aborting startup due to startup process failure
2016-04-27 17:02:06 EDT 572128cc.180f [6-1] user=,db=,remote= LOCATION:
reaper, postmaster.c:2461

We also have strace output for the failing PID (6161)

NOTES:

After uncompressing/etracting the tablespace and cluster directories and
creating the recovery.conf file, if the server is started with the 9.2.15
executables the server starts and recovers successfully (i.e restoring the
exact same files works under 9.2.15 and fails under 9.2.16). Log file
entries for the successful restore are below:

2016-04-27 17:06:13 EDT 572129c5.19c5 [1-1] user=,db=,remote= LOG: 00000:
database system was interrupted; last known up at

2016-04-19 14:18:03 EDT
2016-04-27 17:06:13 EDT 572129c5.19c5 [2-1] user=,db=,remote= LOCATION:
StartupXLOG, xlog.c:6417
2016-04-27 17:06:13 EDT 572129c5.19c5 [3-1] user=,db=,remote= LOG: 00000:
starting archive recovery
2016-04-27 17:06:13 EDT 572129c5.19c5 [4-1] user=,db=,remote= LOCATION:
StartupXLOG, xlog.c:6510
2016-04-27 17:06:14 EDT 572129c5.19c5 [5-1] user=,db=,remote= LOG: 00000:
restored log file "0000000100000006000000FE" from

archive
2016-04-27 17:06:14 EDT 572129c5.19c5 [6-1] user=,db=,remote= LOCATION:
RestoreArchivedFile, xlog.c:3294
2016-04-27 17:06:14 EDT 572129c5.19c5 [7-1] user=,db=,remote= LOG: 00000:
redo starts at 6/FE000080
2016-04-27 17:06:14 EDT 572129c5.19c5 [8-1] user=,db=,remote= LOCATION:
StartupXLOG, xlog.c:6966
2016-04-27 17:06:14 EDT 572129c5.19c5 [9-1] user=,db=,remote= LOG: 00000:
consistent recovery state reached at 6/FE002680
2016-04-27 17:06:14 EDT 572129c5.19c5 [10-1] user=,db=,remote= LOCATION:
CheckRecoveryConsistency, xlog.c:7522
cp: cannot stat `/pgdata/postXX_orig_wal/000000010000000700000000': No such
file or directory
2016-04-27 17:06:14 EDT 572129c5.19c5 [11-1] user=,db=,remote= LOG: 58P01:
could not open file

"pg_xlog/000000010000000700000000" (log file 7, segment 0): No such file or
directory
2016-04-27 17:06:14 EDT 572129c5.19c5 [12-1] user=,db=,remote= LOCATION:
XLogFileReadAnyTLI, xlog.c:2970
2016-04-27 17:06:14 EDT 572129c5.19c5 [13-1] user=,db=,remote= LOG: 00000:
redo done at 6/FE002680
2016-04-27 17:06:14 EDT 572129c5.19c5 [14-1] user=,db=,remote= LOCATION:
StartupXLOG, xlog.c:7104
2016-04-27 17:06:14 EDT 572129c5.19c5 [15-1] user=,db=,remote= LOG: 00000:
last completed transaction was at log time 2016-

04-19 14:18:07.052246-04
2016-04-27 17:06:14 EDT 572129c5.19c5 [16-1] user=,db=,remote= LOCATION:
StartupXLOG, xlog.c:7109
2016-04-27 17:06:14 EDT 572129c5.19c5 [17-1] user=,db=,remote= LOG: 00000:
restored log file "0000000100000006000000FE" from

archive
2016-04-27 17:06:14 EDT 572129c5.19c5 [18-1] user=,db=,remote= LOCATION:
RestoreArchivedFile, xlog.c:3294
cp: cannot stat `/pgdata/postXX_orig_wal/00000002.history': No such file or
directory
2016-04-27 17:06:14 EDT 572129c5.19c5 [19-1] user=,db=,remote= LOG: 00000:
selected new timeline ID: 2
2016-04-27 17:06:14 EDT 572129c5.19c5 [20-1] user=,db=,remote= LOCATION:
StartupXLOG, xlog.c:7222
cp: cannot stat `/pgdata/postXX_orig_wal/00000001.history': No such file or
directory
2016-04-27 17:06:15 EDT 572129c5.19c5 [21-1] user=,db=,remote= LOG: 00000:
archive recovery complete
2016-04-27 17:06:15 EDT 572129c5.19c5 [22-1] user=,db=,remote= LOCATION:
exitArchiveRecovery, xlog.c:5892
2016-04-27 17:06:16 EDT 572129c8.19ce [1-1] user=,db=,remote= LOG: 00000:
autovacuum launcher started
2016-04-27 17:06:16 EDT 572129c8.19ce [2-1] user=,db=,remote= LOCATION:
AutoVacLauncherMain, autovacuum.c:410
2016-04-27 17:06:16 EDT 572129c4.19c3 [3-1] user=,db=,remote= LOG: 00000:
database system is ready to accept connections
2016-04-27 17:06:16 EDT 572129c4.19c3 [4-1] user=,db=,remote= LOCATION:
reaper, postmaster.c:2539
2016-04-27 17:06:42 EDT 572129c4.19c3 [5-1] user=,db=,remote= LOG: 00000:
received smart shutdown request
2016-04-27 17:06:42 EDT 572129c4.19c3 [6-1] user=,db=,remote= LOCATION:
pmdie, postmaster.c:2264
2016-04-27 17:06:42 EDT 572129c8.19ce [3-1] user=,db=,remote= LOG: 00000:
autovacuum launcher shutting down
2016-04-27 17:06:42 EDT 572129c8.19ce [4-1] user=,db=,remote= LOCATION:
AutoVacLauncherMain, autovacuum.c:780
2016-04-27 17:06:42 EDT 572129c6.19c7 [1-1] user=,db=,remote= LOG: 00000:
shutting down
2016-04-27 17:06:42 EDT 572129c6.19c7 [2-1] user=,db=,remote= LOCATION:
ShutdownXLOG, xlog.c:7957
2016-04-27 17:06:42 EDT 572129c6.19c7 [3-1] user=,db=,remote= LOG: 00000:
database system is shut down
2016-04-27 17:06:42 EDT 572129c6.19c7 [4-1] user=,db=,remote= LOCATION:
ShutdownXLOG, xlog.c:7979

Browse pgsql-bugs by date

  From Date Subject
Next Message Thom Brown 2016-04-28 16:41:29 Re: [BUGS] Breakage with VACUUM ANALYSE + partitions
Previous Message Michael Paquier 2016-04-28 12:38:40 Re: alter table .. reset not throwing an error if storage_parameter name is wrong