Re: Usability improvements for pg_stop_backup()

From: Kevin Grittner <kgrittn(at)ymail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Usability improvements for pg_stop_backup()
Date: 2014-08-03 15:55:47
Message-ID: 1407081347.79376.YahooMailNeo@web122303.mail.ne1.yahoo.com
Lists: pgsql-hackers

Josh Berkus <josh(at)agliodbs(dot)com> wrote:

> Currently, if archive_command is failing, pg_stop_backup() will hang
> forever.  The only way to figure out what's wrong with pg_stop_backup()
> is to tail the PostgreSQL logs.  This is difficult for users to
> troubleshoot, and strongly resists any kind of automation.

That is bad.

> Yes, we can work around this by setting statement_timeout, but that has
> two issues (a) the user has to remember to do it before the problem
> occurs, and (b) it won't differentiate between archive failure and other
> reasons it might time out.

Clearly not a long-term solution.

> As such, I propose that pg_stop_backup() should error with an
> appropriate error message ("Could not archive WAL segments") after
> three archiving attempts.  We could also add an optional parameter to
> raise the number of attempts from the default of three.

That sounds sane to me.

> An alternative, if we were doing this from scratch, would be for
> pg_stop_backup to return false or -1 or something if it couldn't
> archive; there are reasons why a user might not care that
> archive_command was failing (shared storage comes to mind).  However,
> that would be a surprising break with backwards compatibility, since
> currently users don't check the result value of pg_stop_backup().

Some might, which is a stronger argument against changing what gets
returned.  Even in a green field though, I would argue that
pg_stop_backup() should return information about the minimum range
of WAL files needed to perform a consistent recovery -- or possibly
duplicate everything in the backup history file.  An error seems
much more appropriate to indicate that the user does not have a
valid backup.
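
To make the proposed semantics concrete, here is a rough sketch of
"error after a bounded number of archiving attempts".  It is only a
model of the behavior discussed above, not PostgreSQL code: the names
stop_backup, try_archive, max_attempts, and ArchiveError are all
hypothetical, and the real function would wait on the archiver rather
than call it directly.

```python
class ArchiveError(Exception):
    """Hypothetical error: required WAL could not be archived in time."""


def stop_backup(try_archive, max_attempts=3):
    """Model of the proposal; not PostgreSQL's implementation.

    try_archive is an assumed callable that returns True once the WAL
    segments needed by the backup have been archived.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for _ in range(max_attempts):
        if try_archive():
            return "backup complete"
    # Fail loudly instead of hanging forever, so the caller knows the
    # backup is not valid and why.
    raise ArchiveError("Could not archive WAL segments")
```

With a try_archive that always fails, the call raises ArchiveError
after three attempts instead of blocking, which is something a backup
script can catch and report.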

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
