Re: FW: Setting up of PITR system.

From: "Rajesh Kumar Mallah" <mallah(dot)rajesh(at)gmail(dot)com>
To: "Grega Bremec" <gregab(at)p0f(dot)net>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: FW: Setting up of PITR system.
Date: 2006-04-01 04:23:41
Message-ID: a97c77030603312023k7bdc705eif22af429dc3c943d@mail.gmail.com
Lists: pgsql-admin

On 3/30/06, Grega Bremec <gregab(at)p0f(dot)net> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: RIPEMD160
>
> Rajesh Kumar Mallah wrote:
> |
> | OK i am posting my full script [ its not heavy programming i guess :) ]
> | shall be grateful if you/someone could review it . (its well commented i think)
> | script also carries sample data.
> |
> | it does following
> | 1. takes base backup to a destined folder by rsync
> | 2. waits for .backup file to arrive in archive folder
> | after pg_stop_backup()
> | 3. searches and removes unwanted archived log files.
> |
> | I have run it many times in my server and it seems to
> | be working fine.
>
> Hello, Rajesh.
>
> Just a couple of comments on the script itself, not what it actually
> does - I never tried WAL archiving before, so I can't comment on that. I
> inserted the comments at relevant points in the script. I'm sorting them
> into three categories, one is just improvements in style, the other is
> optimization and the third is correction of an error.

Dear Grega,

Thanks for the useful tips and for spotting the errors.
I am incorporating some of them and testing
the script on my server. I have concerns about
some of your optimisations, which make the script less
generic; please find my comments below.

>
> | ------------------------------------ BEGIN
> | --------------------------------------------
> | #!/bin/bash
> |
> | # folder where base_backup is put
> | BACKUPFOLDER=/mnt/disk3/base_backups
> | today=`date +%d-%m-%Y-%H-%M-%S`
> | PSQL=/opt/usr/local/pgsql/bin/psql
> | RSYNC="/usr/bin/rsync -a"
> | PGDATADIR=/mnt/disk5/pgdatadir
> |
> | # two table spaces.
> |
> | TS1=/mnt/disk4/bigtables
> | TS2=/mnt/disk3/indexspace
>
> (optimization) Since you're using bash, you can use arrays. This could
> be better written as
>
> ~ TS[0]=/mnt/disk5/pgdatadir
> ~ TS[1]=/mnt/disk4/bigtables
> ~ TS[2]=/mnt/disk3/indexspace
>
> or even
>
> ~ TS=(/mnt/disk5/pgdatadir \
> ~ /mnt/disk4/bigtables \
> ~ /mnt/disk3/indexspace)

Agreed, already incorporated.
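
For reference, a minimal sketch of how the array form can be walked with a plain for loop instead of a counter. The helper name `sync_all` and its argument order are hypothetical, introduced only for illustration; passing the sync command as an argument is just a way to try the loop without a real rsync run.

```bash
#!/bin/bash
# Directories to back up: data directory first, then the tablespaces.
TS=(/mnt/disk5/pgdatadir /mnt/disk4/bigtables /mnt/disk3/indexspace)
BACKUPFOLDER=/mnt/disk3/base_backups

# sync_all CMD DEST DIR...: run CMD on every DIR, stop at the first failure.
# (sync_all is a hypothetical helper, not part of the original script.)
sync_all() {
    local cmd=$1 dest=$2 dir rval
    shift 2
    for dir in "$@"; do
        echo "Syncing $dir ..."
        # $cmd is left unquoted on purpose so "rsync -a" splits into words.
        $cmd "$dir" "$dest"
        rval=$?
        if [ "$rval" -ne 0 ]; then
            echo "Sync of $dir failed with exit status $rval" >&2
            return "$rval"
        fi
    done
    return 0
}

# Real use would be: sync_all "/usr/bin/rsync -a" "$BACKUPFOLDER" "${TS[@]}"
```

Adding a tablespace then only means adding one more entry to TS.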

>
> That way, you can add tablespaces at will and just use a while loop to
> back them up, which greatly simplifies adding new tablespaces or moving
> the script somewhere else. See below for how to implement that.
>
> | # folder where *archived* logs are put.
> | WAL_ARCHIVE=/mnt/wal_archive
> |
> | label=base_backup_${today}
> |
> | echo "Executing pg_start_backup with label $label in server ... "
> |
> | # get the checkpoint at which backup starts
> | # the .backup files seems to be bearing this string in it.
> |
> | CP=`$PSQL -q -Upostgres -d template1 -c "SELECT
> | pg_start_backup('$label');" -P tuples_only -P format=unaligned`
> |
> | echo "Begin CheckPoint is $CP" # this contain string like A/681D1214
> |
> | if [ $? -ne 0 ]
> | then
> | echo "PSQL pg_start_backup failed"
> | exit 1;
> | fi
> | echo "pg_start_backup executed successfully"
>
> (style) If you want to capture any error messages pg_start_backup may
> have caused and store them into ${CP}, you should add 2>&1 at the end of
> the psql invocation, see below snippet.

Incorporated it.

>
> (error) Checking for exit status of pg_start_backup using $? at this
> point will never report an error, as you've used echo prior to checking
> what pg_start_backup returned. You should either move the echo below the
> if statement (by adding an "else" clause) or store the exit status of
> pg_start_backup into RVAL like this:
>
> ~ CP="`$PSQL ... 2>&1`"
> ~ RVAL=$?
> ~ echo "Begin CheckPoint says: ${CP}"
> ~ if [ ${RVAL} -ne 0 ]; then
> ~ ...
> ~ fi
>
> | echo "RSYNC begins.."
> |
> | # rsync each of the folders to the backup folder.
> | for i in $TS1 $TS2 $PGDATADIR ;
> | do
> | echo "Syncing $i .. "
> | time $RSYNC $i $BACKUPFOLDER
> | echo "Done"
> | done
>
> (optimization) If you store locations into an array, you could rewrite
> this as follows:
>
> ~ CTR=0
> ~ while [ -n "${TS[${CTR}]}" ]; do
> ~ echo "Syncing ${TS[${CTR}]}..."
> ~ time ${RSYNC} ${TS[${CTR}]} ${BACKUPFOLDER}
> ~ RVAL=$?
> ~ echo "Sync finished with exit status ${RVAL}"
> ~ if [ ${RVAL} -ne 0 ]; then
> ~ <handle errors>
> ~ fi
> ~ CTR=$((CTR + 1))
> ~ done
> ~ unset CTR
>
> | # fortunately rsync does *not* seem to be exiting with a non zero exit
> | # code for expected file disappearances and modifications.
> | if [ $? -ne 0 ]
> | then
> | echo "RSYNC failed"
> | exit 1;
> | fi
>
> (error) Same error as above - what you're checking here is whether the
> last command in the last for loop run was successful, and this is always
> going to be true as echoing to stdout will never fail until stdout is
> closed for some reason.

Yes, it was an error. I am doing repeated runs of the script
to find the non-zero exit codes that should be treated
as normal in the context of taking base backups.
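
A possible shape for that check, as a sketch only. The rsync man page lists exit code 24 as "partial transfer due to vanished source files", which is expected while the server keeps running during a base backup. Treating 24 (and nothing else besides 0) as normal is my assumption until the repeat runs say otherwise; `rsync_status_ok` is a hypothetical helper name.

```bash
# rsync_status_ok STATUS: succeed if STATUS is acceptable for a base backup.
# Assumption: only 0 (success) and 24 ("partial transfer due to vanished
# source files", per the rsync man page) are treated as normal here.
rsync_status_ok() {
    case "$1" in
        0|24) return 0 ;;
        *)    return 1 ;;
    esac
}

# Usage inside the sync loop (RSYNC, i and BACKUPFOLDER as in the script):
#   $RSYNC "$i" "$BACKUPFOLDER"
#   RVAL=$?
#   if ! rsync_status_ok "$RVAL"; then
#       echo "RSYNC of $i failed with exit status $RVAL" >&2
#       exit 1
#   fi
```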

>
> | echo "RSYNC Done successfully"
> |
> | echo "Executing pg_stop_backup in server ... "
> | $PSQL -Upostgres template1 -c "SELECT pg_stop_backup();"
> | if [ $? -ne 0 ]
> | then
> | echo "PSQL pg_stop_backup failed"
> | exit 1;
> | fi
> | echo "pg_stop_backup done successfully"

I think I should improve the style here.
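
Something along these lines might do, as a rough sketch. It tests the command directly in the `if`, so no echo can get between the command and the check of its exit status. `run_step` is a hypothetical helper name, not something from the script as posted.

```bash
# run_step DESCRIPTION COMMAND [ARGS...]: run COMMAND and report the outcome.
# (run_step is a hypothetical helper, not part of the original script.)
run_step() {
    local desc=$1 rval
    shift
    echo "Executing $desc ..."
    if "$@"; then
        echo "$desc done successfully"
    else
        # In the else branch $? still holds the exit status of the command.
        rval=$?
        echo "$desc failed with exit status $rval" >&2
        return "$rval"
    fi
}

# Usage (PSQL as defined at the top of the script):
#   run_step pg_stop_backup \
#       "$PSQL" -Upostgres template1 -c "SELECT pg_stop_backup();" || exit 1
```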

> | TO_SEARCH="00${CP:4}" # $TO_SEARCH contains 001D1214
> |
> | # now remove the unneeded files.
> |
> | # strip off first 4 chars from CP and prefix 00 to the result.
> | # search the file that has the Checkpoint in its filename.
> | # it takes a while to come, so wait till it comes.
> |
> | while true; do
> | REF_FILE=`ls -1 $WAL_ARCHIVE | grep $TO_SEARCH`
> | if [ -z "$REF_FILE" ]; then
> | echo "Waiting for file with $TO_SEARCH in $WAL_ARCHIVE ... "
> | else
> | break
> | fi
> | sleep 1
> | done
>
> (optimization) You could simplify this significantly using the test builtin:
>
> ~ while [ ! -e ${WAL_ARCHIVE}/*.00${TO_SEARCH}.backup.bz2 ]; do
> ~ echo "Waiting for ${WAL_ARCHIVE}/*.00${TO_SEARCH}.backup.bz2"
> ~ sleep 1
> ~ done
> ~ REF_FILE="`echo ${WAL_ARCHIVE}/*.00${TO_SEARCH}.backup.bz2`"

I would rather not incorporate this, because this code assumes the WAL archives
are being externally compressed to .bz2. Can you suggest an optimized but
generic alternative?
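
One generic possibility, as a sketch: match any file whose name contains the search string followed by `.backup`, whatever suffix (if any) the archive command appends afterwards. `find_backup_marker` is a hypothetical helper name, and the trailing-suffix pattern is my assumption about how compressed archives are named.

```bash
# find_backup_marker DIR KEY: print the first file in DIR whose name contains
# ".KEY.backup", with or without a trailing suffix such as .bz2 or .gz.
# Returns 1 if no such file exists yet.
# (find_backup_marker is a hypothetical helper; KEY is e.g. 001D1214.)
find_backup_marker() {
    local dir=$1 key=$2 f
    for f in "$dir"/*."$key".backup*; do
        # An unmatched glob is left as a literal pattern, so test existence.
        if [ -e "$f" ]; then
            echo "$f"
            return 0
        fi
    done
    return 1
}

# Poll once per second until the marker file shows up in the archive:
#   until REF_FILE=$(find_backup_marker "$WAL_ARCHIVE" "$TO_SEARCH"); do
#       echo "Waiting for file with $TO_SEARCH in $WAL_ARCHIVE ..."
#       sleep 1
#   done
```

Note that REF_FILE then holds a full path, so the directory part has to be stripped (for example with `basename`) before taking the first 24 characters.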

>
> | # REF_FILE is 000000010000000A00000068.001D1214.backup.bz2
> |
> | # take only first 24 chars and store.
> | REF_FILE_NUM=${REF_FILE:0:24}
> |
> | # REF_FILE_NUM is 000000010000000A00000068
> |
> | echo "REF_FILE_NUM=$REF_FILE_NUM"
> |
> | # iterate list of files in the WAL_ARCHIVE folder
> | for i in `ls -1 $WAL_ARCHIVE` ;
> | do
> | # $i is :000000010000000A0000005D.bz2 eg
> | # get first 24 chars in filename
> | FILE_NUM=${i:0:24}
> |
> | # compare if the number is less than the reference
> | # here string comparison is being used.
> | if [[ $FILE_NUM < $REF_FILE_NUM ]]
> | then
> | echo "$FILE_NUM [ $i ] removed"
> | rm -f $WAL_ARCHIVE/$i
> | else
> | echo "$FILE_NUM [ $i ] not removed"
> | fi
> | done
>
> (optimization) Perhaps using find -newer/-anewer/-cnewer could be used
> here to find files older than the reference file:
>
> ~ # "-not -newer" or "\! -newer" will also return REF_FILE
> ~ # so you have to grep it out and use xargs; otherwise you
> ~ # could also use the -delete action
> ~ find ${WAL_ARCHIVE} \! -newer ${REF_FILE} -type f | \
> ~ grep -v "^${REF_FILE}$" | \
> ~ xargs rm -f

No, I have gone by the docs, which say to compare the filenames numerically.
I do not want to assume that more recently created files are numerically greater
than older ones, as I have not seen anything like that in the docs.

I am concerned whether the comparison below, used in the script,

> | if [[ $FILE_NUM < $REF_FILE_NUM ]]

is correct, as it compares strings, not numbers. I am assuming that
the results will be the same as a numerical comparison, as all the filenames
are padded with '0' from the left.
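
A small sketch to check that assumption. For names of equal length made only of digits and uppercase A-F, string order and numeric order agree as long as the comparison uses plain byte order. Inside `[[ ]]` bash compares with the current locale, so the sketch forces `LC_ALL=C`. Both function names are hypothetical; the second one redoes the comparison numerically, field by field, purely as a cross-check.

```bash
# wal_older A B: succeed if WAL segment name A sorts before B.
# Both names must be 24 uppercase hex digits. The body runs in a subshell
# with LC_ALL=C, which forces byte order: 0-9 sort before A-F.
# (wal_older is a hypothetical helper, not part of the original script.)
wal_older() (
    LC_ALL=C
    [[ ${#1} -eq 24 && ${#2} -eq 24 && $1 < $2 ]]
)

# wal_older_numeric A B: the same test done numerically on the three
# 8-hex-digit fields of the name, most significant first, as a cross-check.
wal_older_numeric() {
    local i a b
    for i in 0 8 16; do
        a=$((16#${1:i:8}))
        b=$((16#${2:i:8}))
        (( a < b )) && return 0
        (( a > b )) && return 1
    done
    return 1   # names are equal, so A is not older
}

# Usage in the cleanup loop:
#   if wal_older "$FILE_NUM" "$REF_FILE_NUM"; then rm -f "$WAL_ARCHIVE/$i"; fi
```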

I thank you once again for your comments and shall post the improved version
once my observations are complete.

Regards,
Rajesh Kumar Mallah.
