Re: Proposal: improve shutdown during online backup

From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Proposal: improve shutdown during online backup
Date: 2008-03-26 23:54:50
Message-ID: Pine.GSO.4.64.0803261917180.2012@westnet.com
Lists: pgsql-hackers

On Wed, 26 Mar 2008, Albe Laurenz wrote:

> 1) On "pg_ctl stop|restart -m smart", check if online backup is
> in progress and do not shutdown in this case (treat the online
> backup like an open connection).

As long as you give a warning as to the cause. While you're in there, I
think more output in general about the reason why a smart shutdown failed
would be nice as well. I haven't looked at the code to see if it's
practical but I'd love "shutdown blocked by pid 53213,53216" rather than
having to go search for them myself after it quietly fails.
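
To make the message I'm asking for concrete, here is a minimal Python sketch of the formatting only. It is not taken from the server code; the function name blocked_shutdown_message and the idea that the list of blocking pids is already available are assumptions for illustration.

```python
def blocked_shutdown_message(pids):
    """Build a one-line explanation for a smart shutdown that cannot proceed.

    `pids` is assumed to be the list of backend process IDs still
    connected; how the server would collect them is not covered here.
    """
    if not pids:
        return "shutdown not blocked"
    # Sort so the output is stable and easy to compare between attempts.
    return "shutdown blocked by pid " + ",".join(str(p) for p in sorted(pids))


print(blocked_shutdown_message([53216, 53213]))
```

The point is just that the person running the command sees the pids right away instead of hunting for them.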

> 2) On "pg_ctl stop|restart -m fast", remove backup_label after
> the server has been brought down successfully.

And you need a warning here as well about this fact. I think the actual
details associated with that label should be both printed and put into the
logs at this time, so you know which backup you just hosed. Maybe the
label file could get renamed instead? Just deleting the file without
saving it somewhere doesn't seem right, that's the sort of thing MySQL
would do. If there's [one|some] of those failed backup logs inside $PGDATA
that gives an additional clue to an admin who doesn't watch the logs that
something is wrong with the backups.
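
A rough sketch of the rename idea, in Python rather than the server's C, and only as an illustration under assumptions: the helper name retire_backup_label and the "backup_label.failed.<timestamp>" naming are made up here, not an existing convention.

```python
import os
import time


def retire_backup_label(pgdata, now=None):
    """Rename $PGDATA/backup_label aside instead of deleting it.

    Returns (new_path, label_contents) so the caller can print the
    details and write them to the log, or None if there is no
    backup_label (no online backup was in progress).
    """
    label = os.path.join(pgdata, "backup_label")
    if not os.path.exists(label):
        return None
    with open(label) as f:
        details = f.read()
    # Hypothetical naming scheme: a UTC timestamp suffix keeps repeated
    # failed backups from overwriting each other.
    stamp = time.strftime("%Y%m%d%H%M%S", time.gmtime(now))
    failed = os.path.join(pgdata, "backup_label.failed." + stamp)
    os.rename(label, failed)
    return failed, details
```

A caller would log the returned contents along with the warning, and the leftover renamed file in $PGDATA stays behind as the extra clue.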

There are three options here for how "-m fast" could handle things:

1) Warning, remove backup label.

2) Warning and server is not stopped. This is unacceptable because too
many scripts expect that a fast shutdown will usually take the server down
(/etc/init.d/postgresql being the most popular).

3) Server stops but you do get a stern warning that it will not start
again until you remove the backup label yourself--the current behavior
with a warning. The problem with this one is that some shutdowns don't
have any human involvement (again, consider server reboot) and therefore
you can't assume anyone will ever see this message.

If you want to remove the root problem here, you have to follow (1) and
remove the label. Otherwise it's still the case that the person who
starts the database will be surprised if the person stopping it isn't
paying attention (or isn't a person).

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD