Re: pg_stop_backup does not complete

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_stop_backup does not complete
Date: 2010-02-24 21:52:00
Message-ID: 608.1267048320@sss.pgh.pa.us
Lists: pgsql-hackers

Josh Berkus <josh(at)agliodbs(dot)com> writes:
> Thing is, if archive_command is failing, then the backup is useless
> regardless until it's fixed. And sending the archives to /dev/null (the
> fix you're essentially recommending above) doesn't make the backup any
> more useful. So I'm seeing pg_abort_backup(), which also produces
> markers that prevent the backup from loading, as an improvement on
> the current UI.

On reflection I'm not sure what pg_abort_backup would do for you.
As Heikki points out, by the time the user has realized that
pg_stop_backup() is not completing, it's *already done* all of the
state changes it's going to make. There is no way to take the
backup-complete WAL entry out of the WAL stream; it's already in there
and there are probably ordinary entries after it by now. Having an
oh-the-backup-failed-after-all entry somewhere downstream of that is
entirely useless; the more so because by the time anything could *see*
such an entry, the problem would have been resolved, since the problem
is exactly not having gotten the WAL stream out to the archive.

Before you could enter pg_abort_backup you'd have to control-C out of
the pg_stop_backup call, and that action already accomplishes the only
thing pg_abort_backup could do for you.

So what I am thinking is that this is really just a minor bit of user
unfriendliness in pg_stop_backup. We should address it with one or
both of these changes:

* emit a NOTICE as soon as pg_stop_backup's actual work is done and
it's starting to wait for the archiver (or maybe after it's waited
for a few seconds, but much less than the present 60).

* extend the existing WARNING (and the NOTICE too if we elect to have
one) with a HINT message explicitly saying that you can cancel the
wait but thus-and-such consequences might ensue.
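
To make the two proposals concrete, here is a rough sketch of the wait loop in Python rather than in the backend's C. It is not the actual pg_stop_backup code: the function name, the parameters, the 5-second NOTICE delay, the message wording, and the max_wait_s cutoff (added only so the sketch terminates) are all assumptions made for illustration.

```python
from typing import Callable


def wait_for_archiver(
    segment_archived: Callable[[], bool],
    emit: Callable[[str, str], None],
    sleep: Callable[[float], None],
    notice_after_s: int = 5,      # assumed "few seconds" before the NOTICE
    warning_every_s: int = 60,    # the present 60-second warning interval
    max_wait_s: int = 300,        # sketch-only cutoff so the loop terminates
) -> bool:
    """Poll once per second until the last WAL segment is archived.

    Returns True once archiving is confirmed, False if the sketch-only
    cutoff is reached first.
    """
    # Placeholder wording; the real consequences text is left unspecified.
    hint = "HINT: you can cancel this wait, but consequences may ensue."
    waited = 0
    while not segment_archived():
        if waited >= max_wait_s:
            return False
        if waited == notice_after_s:
            # Proposal 1: tell the user early that the real work is done.
            emit("NOTICE", "backup work is done, waiting for archiver\n" + hint)
        elif waited > 0 and waited % warning_every_s == 0:
            # Proposal 2: the existing periodic WARNING, now with a HINT.
            emit("WARNING", f"still waiting for archiver ({waited} s)\n" + hint)
        sleep(1)
        waited += 1
    return True
```

The sleep and emit callables are injected so the loop can be exercised without a real delay or a real client connection; in the backend the equivalents would be the existing wait and message-reporting machinery.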

Both of these things would only be helpful when using client software
that shows you received notices promptly. psql is okay, but maybe
pgAdmin and other tools would need some further work. There is not
much we can do about that in the core project though.

regards, tom lane
