Re: problems using pg_start_backup/pg_stop_backup and pg_basebackup at same time

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Alex Malek <magicagent(at)gmail(dot)com>
Cc: pgsql-admin <pgsql-admin(at)postgresql(dot)org>
Subject: Re: problems using pg_start_backup/pg_stop_backup and pg_basebackup at same time
Date: 2016-06-20 08:17:57
Message-ID: CABUevEwZ0Zfiff0=Bq2WaW-R-=1gwXPUd0A-eMHV3Gs7THicaQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

On Mon, Jun 13, 2016 at 6:59 PM, Alex Malek <magicagent(at)gmail(dot)com> wrote:

>
> I am experiencing two problems
> 1) pg_stop_backup hangs until pg_basebackup finishes
> 2) WAL contains references to invalid pages
>
> (I suspect when 1. is fixed I won't experience 2. )
>
> Full description below:
>
>
> postgresql: 9.3.13
>
> My situation is I create a writable report database every day using
> pg_start_backup() / rsync / pg_stop_backup()
> (whole process takes 1-2 hours, the rsync actually copies the data dir of
> a slave/warm spare DB)
>

Taking what is called an exclusive backup, which is what pg_start_backup()
does, is not supported off a standby node. Only pg_basebackup is supported
against a standby node.

Can you reproduce your errors if you make this backup from the master?

> Also once a week I create a database for backup/archive purposes using
> pg_basebackup
> (whole process takes about 13 hours)
>
>
> These two processes used to be able to coincide until recently.
> Recent changes include a major debian upgrade and a minor version of
> postgres upgrade from 9.3.10.
>
> Now when the report sync occurs during the pg_basebackup (w/
> --xlog-method=stream option) the pg_stop_backup() hangs until
> the the pg_basebackup has completed (4 hours later).
>

Do you also have an archive_command, and what is it? Normally,
pg_stop_backup() blocks on the archive command - so perhaps you have ended
up with a dependency between that one and the base backup command somehow?

> A labeled WAL backup file is created when the pg_stop_backup() is first
> executed and another
> is created when the pg_basebackup completes.
>
>
>
> While the pg_stop_backup() hangs the following appears in the logs:
>
> 2016-06-11 07:50:45 EDT: WARNING: pg_stop_backup still waiting for all
> required WAL segments to be archived (7680 seconds elapsed)
> 2016-06-11 07:50:45 EDT: HINT: Check that your archive_command is
> executing properly. pg_stop_backup can be canceled safely, but the
> database backup will not be usable without all the WAL segments.
>
>
> until eventually:
>
> 2016-06-11 12:59:07 EDT:LOG: duration: 26190082.497 ms statement: SELECT
> pg_stop_backup()
>
> The archive command appears to work as WAL files are being archived
> successfully with timestamps that occur while pg_stop_backup is hanging.
>
> -rw------- 1 postgres postgres 314 Jun 11 12:56
> 000000030000109200000066.00000028.backup
> -rw------- 1 postgres postgres 16M Jun 11 12:56 000000030000109300000029
> -rw------- 1 postgres postgres 16M Jun 11 12:54 000000030000109300000028
> -rw------- 1 postgres postgres 16M Jun 11 12:52 000000030000109300000027
> <snip>
> -rw------- 1 postgres postgres 16M Jun 11 06:22 0000000300001092000000F1
> -rw------- 1 postgres postgres 335 Jun 11 05:42
> 0000000300001092000000D4.00000028.backup
> -rw------- 1 postgres postgres 16M Jun 11 05:42 0000000300001092000000F0
>
> Content of 0000000300001092000000D4.00000028.backup:
>
> START WAL LOCATION: 1092/D4000028 (file 0000000300001092000000D4)
> STOP WAL LOCATION: 1092/F008EA10 (file 0000000300001092000000F0)
> CHECKPOINT LOCATION: 1092/D4004E08
> BACKUP METHOD: pg_start_backup
> BACKUP FROM: master
> START TIME: 2016-06-11 04:51:04 EDT
> LABEL: reporting 2016-06-11 04:51:03.892804-04
> STOP TIME: 2016-06-11 05:42:37 EDT
>

This file shows that the backup took about an hour, doesn't it? So it
doesn't look like it was waiting?

Or are you saying the wait happens *after* the .backup file has been
dropped in the archives?

Should I be be able to use those two styles of backup at the same time?
>

As a general rule, yes. But you cannot run the start/stop version safely
against a standby.

> Should I replace the start_backup/stop_backup with separate calls to
> create a checkpoint and switch transaction log files?
>
>
That will be equally insecure. You need to run your rsync off the master to
make exclusive base backups.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

In response to

Browse pgsql-admin by date

  From Date Subject
Next Message Slavcho Trnkovski 2016-06-20 08:38:23 WAL retention
Previous Message hubert depesz lubaczewski 2016-06-17 07:58:22 Re: Adding same column to several tables.