Re: where should I stick that backup?

From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Bruce Momjian <bruce(at)momjian(dot)us>, Magnus Hagander <magnus(at)hagander(dot)net>, Noah Misch <noah(at)leadboat(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: where should I stick that backup?
Date: 2020-04-13 00:27:50
Message-ID: 20200413002750.sd3k2s3cwcgbvqam@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2020-04-12 20:02:50 -0400, Robert Haas wrote:
> On Sun, Apr 12, 2020 at 3:17 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > A huge advantage of a scheme like this would be that it wouldn't have to
> > be specific to pg_basebackup. It could just as well work directly on the
> > server, avoiding an unnecessary loop through the network. Which
> > e.g. could integrate with filesystem snapshots etc. Without needing to
> > build the 'archive target' once with server libraries, and once with
> > client libraries.
>
> That's quite appealing. One downside - IMHO significant - is that you
> have to have a separate process to do *anything*. If you want to add a
> filter that just logs everything it's asked to do, for example, you've
> gotta have a whole process for that, which likely adds a lot of
> overhead even if you can somehow avoid passing all the data through an
> extra set of pipes. The interface I proposed would allow you to inject
> very lightweight filters at very low cost. This design really doesn't.

Well, in what you described it'd still be all done inside pg_basebackup,
or did I misunderstand? Once you fetched it from the server, I can't
imagine the overhead of filtering it a bit differently would matter.

But even if it did, the "target" could just reply with "skip" or such,
instead of providing an fd.

What kind of filtering are you thinking of where this is a problem?
Besides just logging the filenames? I just can't imagine how that's a
relevant overhead compared to having to do things like
'shell ssh rhaas@depository pgfile create-exclusive - %f.lz4'

I really think we want the option to eventually do this server-side. And
I don't quite see it as viable to go for an API that allows specifying
shell fragments that are going to be executed server-side.

> Note that you could build this on top of what I proposed, but not the
> other way around.

Why should it not be possible the other way round?

Greetings,

Andres Freund
