Re: basebackups during ALTER DATABASE ... SET TABLESPACE ... not safe?

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: basebackups during ALTER DATABASE ... SET TABLESPACE ... not safe?
Date: 2015-01-27 19:16:43
Message-ID: 20150127191643.GB3362@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2015-01-27 07:16:27 -0500, Robert Haas wrote:
> On Mon, Jan 26, 2015 at 4:03 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> >> I basically have two ideas to fix this.
> >>
> >> 1) Make do_pg_start_backup() acquire a SHARE lock on
> >> pg_database. That'll prevent it from starting while a movedb() is
> >> still in progress. Then additionally add pg_backup_in_progress()
> >> function to xlog.c that checks (XLogCtl->Insert.exclusiveBackup ||
> >> XLogCtl->Insert.nonExclusiveBackups != 0). Use that in createdb() and
> >> movedb() to error out if a backup is in progress.
> >
> > Attached is a patch trying to this. Doesn't look too bad and lead me to
> > discover missing recovery conflicts during a AD ST.
> >
> > But: It doesn't actually work on standbys, because lock.c prevents any
> > stronger lock than RowExclusive from being acquired. And we need need a
> > lock that can conflict with WAL replay of DBASE_CREATE, to handle base
> > backups that are executed on the primary. Those obviously can't detect
> > whether any standby is currently doing a base backup...
> >
> > I currently don't have a good idea how to mangle lock.c to allow
> > this. I've played with doing it like in the second patch, but that
> > doesn't actually work because of some asserts around ProcSleep - leading
> > to locks on database objects not working in the startup process (despite
> > already being used).
> >
> > The easiest thing would be to just use a lwlock instead of a heavyweight
> > lock - but those aren't canceleable...
>
> How about just wrapping an lwlock around a flag variable? movedb()
> increments the variable when starting and decrements it when done
> (must use PG_ENSURE_ERROR_CLEANUP). Starting a backup errors out (or
> waits in 1-second increments) if it's non-zero.

That'd end up essentially being a re-emulation of locks. Don't find that
all that pretty. But maybe we have to go there.

Here's an alternative approach. I think it generally is superior and
going in the right direction, but I'm not sure it's backpatchable.

It basically consists out of:
1) Make GetLockConflicts() actually work.
2) Allow the startup process to actually acquire locks other than
AccessExclusiveLocks. There already is code acquiring other locks,
but it's currently broken because they're acquired in blocking mode
which isn't actually supported in the startup mode. Using this
infrastructure we can actually fix a couple small races around
database creation/drop.
3) Allow session locks during recovery to be heavier than
RowExclusiveLock - since relation/object session locks aren't ever
taken out by the user (without a plain lock first) that's safe.
4) Perform streaming base backups from within a transaction command, to
provide error handling.
5) Make walsender actually react to recovery conflict interrupts
6) Prevent access to the template database of a CREATE DATABASE during
WAL replay.
7) Add an interlock to prevent base backups and movedb() to run
concurrently. What we actually came here for.

Comments?

At the very least it's missing some documentation updates about the
locking changes in ALTER DATABASE, CREATE DATABASE and the base backup
sections.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
0001-Fix-various-issues-around-WAL-replay-and-ALTER-DATAB.patch text/x-patch 28.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2015-01-27 19:17:44 Re: jsonb, unicode escapes and escaped backslashes
Previous Message Pavel Stehule 2015-01-27 18:58:42 Re: proposal: row_to_array function