Re: Adding pipe support to pg_dump and pg_restore

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: David Hedberg <david(dot)hedberg(at)gmail(dot)com>
Cc: David Fetter <david(at)fetter(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Adding pipe support to pg_dump and pg_restore
Date: 2018-09-29 18:03:32
Message-ID: 20180929180332.GR4184@tamriel.snowman.net
Lists: pgsql-hackers

Greetings,

* David Hedberg (david(dot)hedberg(at)gmail(dot)com) wrote:
> On Sat, Sep 29, 2018 at 7:01 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > * David Hedberg (david(dot)hedberg(at)gmail(dot)com) wrote:
> >> On Sat, Sep 29, 2018 at 5:03 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> >> Generally, my thinking is that this can be pretty useful in general
> >> besides encryption. For other formats the dumps can already be written
> >> to standard output and piped through for example gpg or a custom
> >> compression application of the administrator's choice, so in a sense
> >> this functionality would merely add the same feature to the directory
> >> format.
> >
> > That's certainly not the same though. One of the great advantages of
> > custom and directory format dumps is the TOC and the ability to
> > selectively extract data from them without having to read the entire
> > dump file. You end up losing that if you have to pass the entire dump
> > through something else because you're using the pipe.
>
> I can maybe see the problem here, but I apologize if I'm missing the point.
>
> Since all the files are individually passed through separate instances
> of the pipe, they can also be individually restored. I guess the
> --list option could be (adapted to be) used to produce a clear text
> TOC to further use in selective decryption of the rest of the archive?

This can work for directory format, but it wouldn't work for custom
format. For a custom format dump, we'd need a way to encrypt the TOC
independently of the rest, and we might even want to have the TOC
include individual keys for the different objects or similar.

> Possibly combined with an option to not apply the pipeline commands to
> the TOC during dump and/or restore, if there's any need for that.

That certainly doesn't seem to make things simpler or to be a very good
interface.

> But I think the pipe option, or one like it, could be used to easily
> extend the format. Easily supporting a different compression
> algorithm, a different encryption method or even a different storage
> method like uploading the files directly to a bucket in S3. In this
> way I think that it's similar to being able to write the other formats to
> stdout; there are probably many different usages of it out there,
> including custom compression or encryption.

Considering the difficulty in doing selective restores (one of the
primary reasons for doing a logical dump at all, imv) from a dump file
that has to be completely decrypted or decompressed (due to using a
custom compression method), I don't know that I really buy off on this
argument that it's very commonly done or that it's a particularly good
interface to use.
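
To make the selective-restore point concrete, here is a minimal Python
sketch. It is not pg_dump's archive format: the layout, the function names
(write_archive, read_toc, extract_one), and the use of zlib as a stand-in
for any per-object compression or encryption step are all assumptions made
for illustration. The idea is that when the TOC and each object are
transformed independently, one object can be recovered by reading only the
TOC and that object's bytes, whereas running the whole archive through a
single external filter forces the reader to undo the filter over everything
first.

```python
import io
import json
import struct
import zlib


def write_archive(objects):
    """Pack objects into one blob: [4-byte TOC length][TOC][entries...].

    Hypothetical layout for illustration only. zlib stands in for
    whatever per-object transform (compression, encryption) is used.
    """
    body = io.BytesIO()
    toc = {}
    for name, data in objects.items():
        blob = zlib.compress(data)           # each entry transformed on its own
        toc[name] = [body.tell(), len(blob)]  # offset and length within the body
        body.write(blob)
    toc_bytes = zlib.compress(json.dumps(toc).encode("utf-8"))  # TOC transformed separately
    return struct.pack(">I", len(toc_bytes)) + toc_bytes + body.getvalue()


def read_toc(archive):
    """Return (toc, offset where the entry data begins)."""
    (toc_len,) = struct.unpack(">I", archive[:4])
    toc = json.loads(zlib.decompress(archive[4:4 + toc_len]))
    return toc, 4 + toc_len


def extract_one(archive, name):
    """Restore a single entry; return (data, number of entry bytes read)."""
    toc, data_start = read_toc(archive)
    offset, length = toc[name]
    start = data_start + offset
    return zlib.decompress(archive[start:start + length]), length


if __name__ == "__main__":
    objects = {"table_a": b"a" * 1000, "table_b": b"b" * 1000}
    archive = write_archive(objects)
    data, touched = extract_one(archive, "table_b")
    print(len(data), "bytes restored;", touched, "of", len(archive),
          "archive bytes read for the entry")
```

If the whole blob were instead passed through one outer filter (for
example zlib.compress(archive)), extract_one could not seek to an entry
until the entire outer stream had been undone, which is the cost being
described above.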

> If this is simply outside the scope of the directory or the custom
> format, that is certainly understandable (and, to me, somewhat
> regrettable :-) ).

What I think isn't getting through is that while this is an interesting
approach, it really isn't a terribly good one, regardless of how
flexible you view it to be. The way to move this forward seems pretty
clearly to work on adding generalized encryption support to
pg_dump/restore that doesn't depend on calling external programs
underneath the directory format with a pipe.

Thanks!

Stephen
