Re: security labels on databases are bad for dump & restore

From: Noah Misch <noah(at)leadboat(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: security labels on databases are bad for dump & restore
Date: 2015-07-30 06:49:41
Message-ID: 20150730064941.GA1582735@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jul 29, 2015 at 10:50:53AM -0400, Robert Haas wrote:
> On Wed, Jul 29, 2015 at 12:39 AM, Noah Misch <noah(at)leadboat(dot)com> wrote:
> > On Tue, Jul 28, 2015 at 03:36:13PM -0400, Robert Haas wrote:
> >> On Tue, Jul 28, 2015 at 3:33 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> >> > Hm? Let me try again: If the admin does a ALTER DATABASE ... SET guc =
> >> > ... *before* restoring a backup and the backup does contain a setting
> >> > for the same guc, but with a different value it'll overwrite the
> >> > previous explicit action by the DBA without any warning. If the backup
> >> > does *not* contain that guc the previous action survives. That's
> >> > confusing, because you're more likely to be in the 'the backup does not
> >> > contain the guc' situation when testing where it thus will work.
> >>
> >> True. But I don't think modifying a database before restoring into it
> >> is terribly supported. Even pg_dump --clean, which is supposed to do
> >> this sort of thing, doesn't seem to work terribly reliably. We could
> >> try to fix this by having a command like ALTER DATABASE ... RESET ALL
> >> that we issue before restoring the settings, but I'm afraid that will
> >> take us into all sorts of unreasonable scenarios that are better just
> >> labeled as "don't do that".
> >
> > Andres's example is a harbinger of the semantic morass ahead. Excepting
> > database objects and the "public" schema object, pg_dump and pg_dumpall mutate
> > only the objects they CREATE. They consistently restore object properties
> > (owner, ACLs, security label, etc.) if and only if issuing a CREATE statement
> > for the object. For example, restoring objects contained in a schema without
> > restoring the schema itself changes none of those schema properties. pg_dump
> > and pg_dumpall have mostly followed that rule for databases, too, but they
> > depart from it for comment and security label. That was a mistake. We can't
> > in general mutate an existing database to match, because we can't mutate the
> > encoding, datcollate or datctype. Even discounting that problem, I value
> > consistency with the rest of the dumpable object types.
>
> What we've proven so far (if Craig's comments are to be believed) is
> that the oft-recommended formula of pg_dumpall -g plus pg_dump of each
> database doesn't completely work. That's absolutely gotta be fixed.

What exact formula did you have in mind? It must not be merely

1. "pg_dumpall -g"
2. "pg_dump" (without --create) per database

which _never_ works: it emits no CREATE DATABASE statements. Perhaps this?

1. "pg_dumpall -g"
2. Issue a handwritten CREATE DATABASE statement per database with correct
encoding, lc_ctype and lc_collate parameters. All other database
properties can be wrong; the dump will fix them.
3. "pg_dump" (without --create) per database

That neglects numerous database properties today, but we could make it work.
Given the problems I described upthread, it's an inferior formula that I
recommend against propping up. I much prefer making this work completely:

1. "pg_dumpall -g"
2. "pg_dump --create" per database

Another formula I wouldn't mind offering:

1. "pg_dumpall -g"
2. pg_dumpall --empty-databases
3. "pg_dump" (without --create) per database

Code for an --empty-databases option already exists for "pg_dumpall -g
--binary-upgrade". A patch turning that into a user-facing feature might be
quite compact. I don't see much point given a complete "pg_dump --create",
but I wouldn't object.

Thanks,
nm

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2015-07-30 07:00:11 Re: Using quicksort and a merge step to significantly improve on tuplesort's single run "external sort"
Previous Message Tatsuo Ishii 2015-07-30 06:03:30 Updatable view?