pg_upgrade: delete_old_cluster.sh issues

From: Marc Mamin <M(dot)Mamin(at)intershop(dot)de>
To: "'pgsql-hackers(at)postgresql(dot)org'" <pgsql-hackers(at)postgresql(dot)org>
Subject: pg_upgrade: delete_old_cluster.sh issues
Date: 2013-11-12 10:35:58
Message-ID: B6F6FD62F2624C4C9916AC0175D56D880CE46DB7@jenmbs01.ad.intershop.net
Lists: pgsql-hackers

Hello,

IMHO, there is a serious issue in the script that cleans up the old data directory
after running pg_upgrade in link mode.

In short: the first step in delete_old_cluster.sh is to delete the old $PGDATA folder,
which, when symbolic links are involved, may contain tablespaces used by the new instance.

In detail, our use case:

Our postgres data directories are organized as follows:

1) They are all registered in a root location, e.g. /opt/app,
but can be located somewhere else using symbolic links:

ll /opt/app/
...
postgresql-data-1 -> /pgdata/postgresql-data-1

2) We have fixed names for the root locations of tablespaces within $PGDATA.
These can be real folders or, again, symbolic links to some other places:

ll /pgdata/postgresql-data-1
...
tblspc_data
tblspc_idx -> /datarep/pg1/tblspc_idx

(additionally, each schema has its own tablespaces in these locations, but this is not relevant here)

3) We also have some custom content within $PGDATA, e.g. an extra log folder used by our deployment script.

After running pg_upgrade, the tablespace locations within the NEW instance are:

ll pg_tblspc

16428 -> /opt/app/postgresql-data-1/tblspc_data/foo
16429 -> /opt/app/postgresql-data-1/tblspc_idx/foo

which, after resolving the symbolic links, is equivalent to:

/pgdata/postgresql-data-1/tblspc_data/foo (x)
/datarep/pg1/tblspc_idx/foo (y)
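
For reference, the resolved targets can be listed with something like the following. This is only a sketch: the function name list_tblspc_targets is made up for illustration, and it assumes that readlink -f (GNU coreutils) is available on the host.

```shell
#!/bin/sh
# Hypothetical helper: print each pg_tblspc entry of a data directory
# together with its fully resolved target.
# Assumption: readlink -f (GNU coreutils) is available.
list_tblspc_targets() {
    datadir=$1
    for link in "$datadir"/pg_tblspc/*; do
        # Skip anything that is not a symbolic link (including an unmatched glob).
        [ -L "$link" ] || continue
        printf '%s -> %s\n' "$(basename "$link")" "$(readlink -f "$link")"
    done
}
```

Called as list_tblspc_targets /pgdata/postgresql_93-data-1, it should print one line per tablespace OID.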

I called pg_upgrade using the true paths (no symbolic links):

./pg_upgrade \
  --link \
  --check \
  --old-datadir "/pgdata/postgresql-data-1" \
  --new-datadir "/pgdata/postgresql_93-data-1"

Now, checking what the cleanup script would do:

cat delete_old_cluster.sh
#!/bin/sh

(a) rm -rf /pgdata/postgresql-data-1
(b) rm -rf /opt/app/postgresql-data-1/tblspc_data/foo/PG_9.1_201105231
(c) rm -rf /opt/app/postgresql-data-1/tblspc_idx/foo/PG_9.1_201105231

a: deletes the folder (x), which contains data for the NEW Postgres instance!
b: already removed by (a)
c: still exists in /datarep/pg1/tblspc_idx/foo but can no longer be found,
because the symbolic link in /pgdata/postgresql-data-1 has already been deleted by (a)

Moreover, our custom content in $OLD_PGDATA would be gone too.
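
Such a situation could be detected before anything is removed. As an illustration only (this is not something pg_upgrade does today), a check along these lines would report a tablespace of the new cluster that resolves into the old data directory. The function name new_tblspc_inside_old is hypothetical, and readlink -f (GNU coreutils) is assumed to be available.

```shell
#!/bin/sh
# Hypothetical check: return 0 if any tablespace link of the NEW cluster
# resolves to a location inside the OLD data directory, 1 otherwise.
# Assumption: readlink -f (GNU coreutils) is available.
new_tblspc_inside_old() {
    old_real=$(readlink -f "$1")
    new_datadir=$2
    # Without a resolvable old directory there is nothing to compare against.
    [ -n "$old_real" ] || return 2
    for link in "$new_datadir"/pg_tblspc/*; do
        [ -L "$link" ] || continue
        target=$(readlink -f "$link")
        case "$target" in
            "$old_real"/*)
                echo "in use by new cluster: $target"
                return 0
                ;;
        esac
    done
    return 1
}
```

With our layout, new_tblspc_inside_old /pgdata/postgresql-data-1 /pgdata/postgresql_93-data-1 would be expected to report (x).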

It seems that these issues could all be avoided
by first removing the expected content of $OLD_PGDATA
and only then unlinking $OLD_PGDATA itself if it is empty
(or by adding a note to the output of pg_upgrade):

replace

rm -rf /pgdata/postgresql-data-1

with

cd /pgdata/postgresql-data-1
rm -rf base
rm -rf global
rm -rf pg_clog
rm -rf pg_hba.conf (*)
rm -rf pg_ident.conf (*)
rm -rf pg_log
rm -rf pg_multixact
rm -rf pg_notify
rm -rf pg_serial
rm -rf pg_stat_tmp
rm -rf pg_subtrans
rm -rf pg_tblspc
rm -rf pg_twophase
rm -rf PG_VERSION (*)
rm -rf pg_xlog
rm -rf postgresql.conf (*)
rm -rf postmaster.log
rm -rf postmaster.opts (*)

(*): could be nice to keep as a reference.
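
The same idea written as a function, as a rough sketch only: the name remove_old_cluster_files is made up, the entry list is simply the one above for a 9.1 cluster, and the PG_VERSION test is an extra safety check I am assuming would be wanted.

```shell
#!/bin/sh
# Hypothetical sketch: remove only the entries PostgreSQL itself creates in
# the old data directory, then remove the directory only if nothing is left.
# The entry list is the one given above for a 9.1 cluster.
remove_old_cluster_files() {
    old_datadir=$1
    # Assumed safety check: refuse to touch anything without a PG_VERSION file.
    [ -f "$old_datadir/PG_VERSION" ] || {
        echo "not a data directory: $old_datadir" >&2
        return 1
    }
    for entry in base global pg_clog pg_hba.conf pg_ident.conf pg_log \
                 pg_multixact pg_notify pg_serial pg_stat_tmp pg_subtrans \
                 pg_tblspc pg_twophase PG_VERSION pg_xlog postgresql.conf \
                 postmaster.log postmaster.opts; do
        # rm -rf does not follow symbolic links, so a tablespace directory
        # reached through a link in pg_tblspc is left untouched.
        rm -rf "$old_datadir/$entry"
    done
    # rmdir fails (and leaves the directory) if custom content remains.
    rmdir "$old_datadir" 2>/dev/null || echo "kept non-empty $old_datadir"
}
```

To keep the files marked (*), they would just be dropped from the list; the directory then stays in place because rmdir refuses to remove a non-empty folder.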

best regards,

Marc Mamin
