Re: BUG #15638: pg_basebackup with --wal-method=stream incorrectly generates WAL segment created during backup

From: "Maeldron T(dot)" <maeldron(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #15638: pg_basebackup with --wal-method=stream incorrectly generates WAL segment created during backup
Date: 2019-02-16 19:13:32
Message-ID: CAKatfSkD0Q=XLqAi_sPO162qpBNPuOq2aYPBcNA3gWB6VUVpoQ@mail.gmail.com
Lists: pgsql-bugs

I ran a quick test using nearly empty databases. I did not promote the
standby; I only took the base backup with two different methods.

When I did it on macOS with PostgreSQL 11.1, the .done file existed only
under the data directory I created with -X fetch. The WAL files were the same.

When I did it on FreeBSD with PostgreSQL 10.6, the .done file existed only
under the -X fetch directory, and the WAL files were also different. I
don’t know whether this is a problem or not, but I could reproduce it on the
first attempt.

This was after the two basebackups:

$ pg_basebackup -p 5433 -v -R -P -D 1 -X fetch
$ pg_basebackup -p 5433 -v -R -P -D 2 -X stream

$ diff -ur 1/pg_wal/ 2/pg_wal/
Only in 1/pg_wal/: 00000001000000000000000C
Only in 1/pg_wal/: 00000001000000000000000D
Files 1/pg_wal/00000001000000000000000E and 2/pg_wal/00000001000000000000000E differ
Only in 1/pg_wal/archive_status: 00000001000000000000000C.done
Only in 1/pg_wal/archive_status: 00000001000000000000000D.done
Only in 1/pg_wal/archive_status: 00000001000000000000000E.done
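
The missing .done markers can also be checked one directory at a time,
without diffing two backups. The following is only a sketch: the function
name wal_without_done is made up for illustration, and it assumes the
PostgreSQL 10+ layout (pg_wal/archive_status) and plain 24-character
segment file names.

```shell
# List WAL segments in a data directory that have no matching .done marker.
# Usage: wal_without_done DATADIR   (hypothetical helper, not part of PostgreSQL)
wal_without_done() {
    datadir=$1
    for seg in "$datadir"/pg_wal/*; do
        name=${seg##*/}
        # Skip directories (such as archive_status) and unmatched globs.
        [ -f "$seg" ] || continue
        # Keep only names made of exactly 24 uppercase hex digits.
        case $name in
            *[!0-9A-F]*) continue ;;
        esac
        [ "${#name}" -eq 24 ] || continue
        # Report the segment if its .done marker is absent.
        [ -e "$datadir/pg_wal/archive_status/$name.done" ] || printf '%s\n' "$name"
    done
}
```

Run against the two directories above, `wal_without_done 1` should print
nothing and `wal_without_done 2` should print 00000001000000000000000E,
going by the diff output.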

$ less log/1/2019-02-16_19-48-29.log
2019-02-16 19:48:29 CET LOG: database system was interrupted; last known up at 2019-02-16 19:44:45 CET
2019-02-16 19:48:29 CET LOG: entering standby mode
2019-02-16 19:48:29 CET LOG: redo starts at 0/C000028
2019-02-16 19:48:29 CET LOG: consistent recovery state reached at 0/C000130
2019-02-16 19:48:29 CET LOG: database system is ready to accept read only connections
2019-02-16 19:48:29 CET LOG: started streaming WAL from primary at 0/D000000 on timeline 1

$ less log/2/2019-02-16_19-48-34.log
2019-02-16 19:48:34 CET LOG: database system was interrupted; last known up at 2019-02-16 19:45:15 CET
2019-02-16 19:48:34 CET LOG: entering standby mode
2019-02-16 19:48:34 CET LOG: redo starts at 0/E000028
2019-02-16 19:48:34 CET LOG: consistent recovery state reached at 0/E000130
2019-02-16 19:48:34 CET LOG: database system is ready to accept read only connections
2019-02-16 19:48:34 CET LOG: started streaming WAL from primary at 0/F000000 on timeline 1
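
As a cross-check, the redo start LSNs in the two logs line up with the
segment names in the pg_wal diff output. A rough sketch of the mapping,
assuming the default 16 MB WAL segment size (the function name
lsn_to_walfile is hypothetical, chosen only for this example):

```shell
# Map a timeline ID and an LSN such as 0/C000028 to a WAL segment file name.
# Assumes the default 16 MB segment size; a cluster initialized with a
# different --wal-segsize would need segsz adjusted.
lsn_to_walfile() {
    tli=$1
    lsn=$2
    hi=${lsn%%/*}                      # hex digits before the slash
    lo=${lsn##*/}                      # hex digits after the slash
    segsz=$((16 * 1024 * 1024))
    # Absolute segment number = 64-bit LSN divided by the segment size.
    segno=$(( ( (0x$hi << 32) | 0x$lo ) / segsz ))
    # Number of segments per 4 GB "log id" (256 with 16 MB segments).
    per_id=$(( 0x100000000 / segsz ))
    printf '%08X%08X%08X\n' "$tli" $((segno / per_id)) $((segno % per_id))
}
```

With this, `lsn_to_walfile 1 0/C000028` prints 00000001000000000000000C
and `lsn_to_walfile 1 0/E000028` prints 00000001000000000000000E, so the
first backup started in segment C and the second in segment E.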

$ diff -ur 1/base/ 2/base/
Files 1/base/16386/pg_internal.init and 2/base/16386/pg_internal.init differ

I did nothing except for starting the two clusters. There was no activity
on the master. I did not promote.

M.

On Sat, Feb 16, 2019 at 4:25 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:

> On Sat, Feb 16, 2019 at 12:26:13AM +0000, PG Bug reporting form wrote:
> > When new slave is created by taking base backup from the primary using
> > pg_basebackup with --wal-method=stream option the WAL file generated during
> > the backup is different (as compared with diff or cmp command) than that on
> > the master and in WAL archive directory. Furthermore, this file does not
> > exist in pg_wal/archive_status with .done extension on new slave, though it
> > exists in pg_wal directory, resulting in failed attempt to archive this file
> > when slave node is promoted as master node.
> > 2019-02-15 14:15:58.872 PST [5369] DETAIL: The failed archive command was:
> > test ! -f /mnt/pgsql/archive/000000010000000000000002 && cp pg_wal/000000010000000000000002 /mnt/pgsql/archive/000000010000000000000002
>
> How do you promote your standby? In Postgres 10, the last, partial
> WAL segment of a past timeline generated at promotion is renamed
> .partial to avoid any conflicts, so as this should normally not
> happen if you do not use archive_mode = always.
>
> Please note that your archive command is not safe. For one, it does
> not sync the archived segment before archive_command returns to the
> backend.
> --
> Michael
>
