Re: BUG #4566: pg_stop_backup() reports incorrect STOP WAL LOCATION

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Randy Isbell <jisbell(at)cisco(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #4566: pg_stop_backup() reports incorrect STOP WAL LOCATION
Date: 2009-01-15 14:25:29
Message-ID: 496F4759.6060203@enterprisedb.com
Lists: pgsql-bugs pgsql-docs pgsql-hackers

Looking at the original post again:

> The resulting *.backup file:
>
> START WAL LOCATION: 10/FE1E2BAC (file 0000000200000010000000FE)
> STOP WAL LOCATION: 10/FF000000 (file 0000000200000010000000FF)
> CHECKPOINT LOCATION: 10/FE1E2BAC
> START TIME: 2008-11-09 01:15:06 CST
> LABEL: /bck/db/sn200811090115.tar.gz
> STOP TIME: 2008-11-09 01:15:48 CST
>
> In my 8.3.4 instance, WAL file naming occurs as:
>
> ...
> 0000000100000003000000FD
> 0000000100000003000000FE
> 000000010000000400000000
> 000000010000000400000001
> ...
>
> WAL files never end in 'FF'. This causes a problem when trying to collect
> the ending WAL file for backup.

I can see the potential confusion here. START WAL LOCATION is an
inclusive value, while STOP WAL LOCATION is exclusive. You need to
archive all WAL files < STOP WAL LOCATION to have a valid backup, not
<=. Printing the filenames adds to the confusion.
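
To make the inclusive/exclusive rule concrete, here is a rough sketch in Python (not code from the server). It assumes the default 16 MB segment size and the timeline/log/segment file naming shown in the quoted listing; the helper names are made up for illustration. The last file you need is the one containing the byte just before STOP WAL LOCATION, not the one containing the stop location itself.

```python
# Assumption: default WAL segment size of 16 MB.
WAL_SEG_SIZE = 16 * 1024 * 1024


def parse_location(loc):
    """Split a location such as '10/FE1E2BAC' into (log id, byte offset)."""
    log_id, offset = loc.split("/")
    return int(log_id, 16), int(offset, 16)


def segment_file(timeline, log_id, seg):
    """Build the 24-hex-digit file name: timeline, log id, segment number."""
    return "%08X%08X%08X" % (timeline, log_id, seg)


def first_needed_file(timeline, start_loc):
    """START WAL LOCATION is inclusive: take the segment containing it."""
    log_id, offset = parse_location(start_loc)
    return segment_file(timeline, log_id, offset // WAL_SEG_SIZE)


def last_needed_file(timeline, stop_loc):
    """STOP WAL LOCATION is exclusive: take the segment containing the
    byte just before it."""
    log_id, offset = parse_location(stop_loc)
    if offset == 0:
        # Not handled in this sketch; the previous byte would live in the
        # prior log id.
        raise ValueError("offset 0 is not handled by this sketch")
    return segment_file(timeline, log_id, (offset - 1) // WAL_SEG_SIZE)


print(first_needed_file(2, "10/FE1E2BAC"))
print(last_needed_file(2, "10/FF000000"))
```

With the values from the original report (timeline 2), both calls name the file ending in FE, which is the only file that backup needs.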

Perhaps if we printed them like "files 0000000200000010000000FE <= X <
0000000200000010000000FF" the intention would be clearer, but we can't
change the format now without breaking all existing backups.

In 8.4, this will be less of an issue, because pg_stop_backup() now
waits for the last file to be archived before returning, so you don't
have to look at those values to implement the waiting yourself.

In passing, I notice that the manual says for pg_switch_xlog():

> pg_switch_xlog moves to the next transaction log file, allowing the current file to be archived (assuming you are using continuous archiving). The result is the ending transaction log location within the just-completed transaction log file. If there has been no transaction log activity since the last transaction log switch, pg_switch_xlog does nothing and returns the end location of the previous transaction log file.

That's incorrect. According to the comments in RequestXLogSwitch(), what
it actually returns is:

> * The return value is either the end+1 address of the switch record,
> * or the end+1 address of the prior segment if we did not need to
> * write a switch record because we are already at segment start.

Note that "end+1 address of the prior segment" is the same as "first
byte of the *next* segment", which contradicts the manual. I'll
change that paragraph in the manual into:

The result is the ending transaction log location *+ 1* within the
just-completed transaction log file. If there has been no transaction
log activity since the last transaction log switch,
<function>pg_switch_xlog</> does nothing and returns the *start*
location of the transaction log file *currently in use*.
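
The "end + 1" point can be checked with a little arithmetic. This is only an illustrative Python sketch, assuming the default 16 MB segment size and hypothetical helper names: the address one past the last byte of a segment is the same number as the first byte of the following segment, so a plain floor division attributes it to that following segment.

```python
# Assumption: default WAL segment size of 16 MB.
WAL_SEG_SIZE = 16 * 1024 * 1024


def segment_start(seg):
    """Offset of the first byte of segment number seg within its log file."""
    return seg * WAL_SEG_SIZE


def segment_end_plus_one(seg):
    """Offset one past the last byte of segment number seg."""
    return segment_start(seg) + WAL_SEG_SIZE


def containing_segment(offset):
    """Segment number that a byte offset falls into (floor division)."""
    return offset // WAL_SEG_SIZE


end_plus_one = segment_end_plus_one(0xFD)
# The end+1 address of segment FD is the start address of segment FE ...
assert end_plus_one == segment_start(0xFE)
# ... so it is attributed to FE, while the byte just before it is in FD.
assert containing_segment(end_plus_one) == 0xFE
assert containing_segment(end_plus_one - 1) == 0xFD
```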

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
