pg_basebackup -x stream from the standby gets stuck

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: pg_basebackup -x stream from the standby gets stuck
Date: 2012-02-07 11:30:56
Message-ID: CAHGQGwFim5F61AfdLQH4PvARPr0Ace2=9QH62khYGraWY4E5TQ@mail.gmail.com
Lists: pgsql-hackers

Hi,

http://www.depesz.com/2012/02/03/waiting-for-9-2-pg_basebackup-from-slave/
> =$ time pg_basebackup -D /home/pgdba/slave2/ -F p -x stream -c fast -P -v -h 127.0.0.1 -p 5921 -U replication
> xlog start point: 2/AC4E2600
> pg_basebackup: starting background WAL receiver
> 692447/692447 kB (100%), 1/1 tablespace
> xlog end point: 2/AC4E2600
> pg_basebackup: waiting for background process to finish streaming...
> pg_basebackup: base backup completed
>
> real 3m56.237s
> user 0m0.224s
> sys 0m0.936s
>
> (time is long because this is only test database with no traffic, so I had to make some inserts for it to finish)

The article above points out a problem with running pg_basebackup against a
standby: when "-x stream" is specified, pg_basebackup gets stuck if there is
no traffic in the database.

When "-x stream" is specified, pg_basebackup forks a background process to
receive WAL records during the backup, takes an online backup, and waits for
the background process to exit. The forked background process keeps receiving
WAL records, and whenever it reaches the end of a WAL file, it checks whether
it has already received all the WAL files required for the backup, and exits if
so. This means that at least one WAL segment switch is required for
pg_basebackup with the "-x stream" option to finish.
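
To illustrate the behavior described above, here is a minimal, self-contained
simulation of a receiver that runs its stop check only at a WAL file boundary.
It is a sketch, not the pg_basebackup source: the function names, the flat
64-bit WAL position, and the assumption that a received chunk never crosses a
segment boundary are all hypothetical simplifications.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define WAL_SEG_SIZE (16u * 1024 * 1024)   /* default WAL segment size: 16 MB */

/* Hypothetical helper: has streaming reached the backup's end position? */
static bool reached_backup_end(uint64_t received_upto, uint64_t backup_end)
{
    return received_upto >= backup_end;
}

/*
 * Simulate a receiver that checks the stop condition only when it reaches
 * the end of a WAL file. Returns true if the receiver would exit after
 * consuming the given chunks, false if it would keep waiting for more WAL.
 */
static bool exits_with_segment_end_check(uint64_t start_pos,
                                         const uint32_t *chunks, int nchunks,
                                         uint64_t backup_end)
{
    uint64_t pos = start_pos;

    for (int i = 0; i < nchunks; i++)
    {
        /* Simplifying assumption: a chunk never crosses a segment boundary. */
        assert((pos % WAL_SEG_SIZE) + chunks[i] <= WAL_SEG_SIZE);

        pos += chunks[i];

        /* The stop check runs only at the end of a WAL file. */
        if (pos % WAL_SEG_SIZE == 0 && reached_backup_end(pos, backup_end))
            return true;
    }
    return false;
}
```

With chunks of 4096 and 8192 bytes and a backup end position of 4096, the
function returns false even though the end position has already been passed,
because no segment boundary was reached. Only once further WAL fills the
segment does it return true.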

In a backup from the master, a WAL file switch always occurs at both the start
and the end of the backup (i.e., in do_pg_start_backup() and
do_pg_stop_backup()), so the above logic works fine even if there is no traffic.
OTOH, in a backup from the standby, no WAL file switch is performed at all
while there is no traffic. So in that case, the background process never
reaches the end of a WAL file, and therefore never checks whether all the
required WAL has arrived and never exits. As a result, pg_basebackup gets stuck.

To fix the problem, I'd propose changing the background process so that it
checks whether all the required WAL has arrived every time data is received,
even if the end of a WAL file has not been reached. Patch attached. Comments?
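
As a rough illustration of the proposed behavior (this is not the attached
patch, just a self-contained simulation), the stop check could run after every
received chunk rather than only at a WAL file boundary. The function name and
the flat 64-bit WAL position below are hypothetical simplifications.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Simulate a receiver that checks the stop condition every time data is
 * received. Returns true if the receiver would exit after consuming the
 * given chunks, false if it would keep waiting for more WAL.
 */
static bool exits_with_per_receive_check(uint64_t start_pos,
                                         const uint32_t *chunks, int nchunks,
                                         uint64_t backup_end)
{
    uint64_t pos = start_pos;

    /* Precondition: a non-empty chunk list must point at real data. */
    assert(chunks != NULL || nchunks == 0);

    for (int i = 0; i < nchunks; i++)
    {
        pos += chunks[i];

        /* The stop check runs after every receive, boundary or not. */
        if (pos >= backup_end)
            return true;
    }
    return false;
}
```

With chunks of 4096 and 8192 bytes and a backup end position of 4096, this
version returns true after the first chunk, without needing a WAL file switch.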

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachment Content-Type Size
fix_backup_stuck_v1.patch text/x-diff 7.7 KB
