Re: Allow replication roles to use file access functions

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Allow replication roles to use file access functions
Date: 2015-09-03 12:53:20
Message-ID: 20150903125320.GB3685@tamriel.snowman.net
Lists: pgsql-hackers

* Michael Paquier (michael(dot)paquier(at)gmail(dot)com) wrote:
> On Thu, Sep 3, 2015 at 11:20 AM, Stephen Frost wrote:
> >> Not only, +clog, configuration files, etc.
> >
> > Configuration files? Perhaps you could elaborate?
>
> Sure. Sorry for being unclear. It copies everything that is not a
> relation file, a kind of base backup without the relation files then.

How does that work on systems where the configuration files aren't
stored under PGDATA (Debian and derivatives, at least)? I guess I don't
quite see why it's necessary for pg_rewind to copy the configuration
files in the first place; it doesn't have the same role as
pg_basebackup, at least as I understand it.

> >> The problem when using differential backups in this case is
> >> performance as mentioned above. We would need to scan the whole target
> >> cluster, which may take time, the current approach of pg_rewind only
> >> needs to scan WAL records to find the list of blocks modified, and
> >> directly requests them from the source. I would expect pg_rewind to be
> >> as quick as possible.
> >
> > I don't follow why the current approach of pg_rewind would have to
> > change. All I'm suggesting is that we have a different way, one which
> > is much more restricted, for pg_rewind to request exactly the
> > information it needs for efficient operation.
>
> Ah, OK. I thought that you were referring to a protocol where caller
> sends a single LSN from which it gets a differential backup that needs
> to scan all the relation files of the source cluster to get the data
> blocks with an LSN newer than the one sent, and then sends them back
> to the caller.

No, apologies, I was simply pointing out that we might want this kind of
capability at the protocol level to support other replication protocol
clients.
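
To make the cost difference discussed above a bit more concrete, here is
a toy model, not actual pg_rewind or backend code. LSNs are plain
integers, and the WalRecord type, both function names, and the
page_lsns mapping are made up for illustration. Which cluster each scan
would run against is deliberately left out.

```python
from typing import Dict, Iterable, List, NamedTuple, Set, Tuple


class WalRecord(NamedTuple):
    """Hypothetical, heavily simplified WAL record: one modified block."""
    lsn: int
    relation_oid: int
    block: int


def blocks_from_wal(records: Iterable[WalRecord],
                    since_lsn: int) -> Dict[int, List[int]]:
    """Collect modified blocks by reading only the WAL records."""
    changed: Dict[int, Set[int]] = {}
    for rec in records:
        if rec.lsn > since_lsn:
            changed.setdefault(rec.relation_oid, set()).add(rec.block)
    return {oid: sorted(blocks) for oid, blocks in changed.items()}


def blocks_from_full_scan(page_lsns: Dict[Tuple[int, int], int],
                          since_lsn: int) -> Dict[int, List[int]]:
    """Collect modified blocks by checking the LSN of every page.

    page_lsns maps (relation_oid, block) to that page's LSN, so the loop
    visits every page of every relation, changed or not.
    """
    changed: Dict[int, Set[int]] = {}
    for (oid, block), page_lsn in page_lsns.items():
        if page_lsn > since_lsn:
            changed.setdefault(oid, set()).add(block)
    return {oid: sorted(blocks) for oid, blocks in changed.items()}
```

The first function does work proportional to the amount of WAL written
since the given LSN, the second proportional to the size of the cluster,
which is the performance concern raised in the quoted text.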

> I guess that what you are suggesting instead is an approach where
> caller sends something like that through the replication protocol with
> a relation OID and a block list:
> BLOCK_DIFF relation_oid BLOCK_LIST m,n,[o, ...]

Right, something along those lines is what I had been thinking. We
would probably need to provide independent commands for the different
file types, with the parameters expressed in terms appropriate for each
kind of file (block numbers for heap, XIDs for WAL and CLOG?).
Essentially, whatever API would be both simple for pg_rewind and general
enough to be useful for other clients in the future. At least, I
imagine that pg_rewind would be a bit simpler if it could communicate
with the backend in the 'language of PG' rather than having to specify
file names and paths.
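
As a rough illustration of the kind of request I have in mind, here is a
minimal sketch of parsing the command form quoted above. The BLOCK_DIFF
command does not exist in the replication protocol today; the syntax is
only the one proposed in this thread, and BlockDiffRequest and
parse_block_diff are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BlockDiffRequest:
    """Hypothetical request: a relation OID plus the blocks wanted from it."""
    relation_oid: int
    blocks: List[int]


def parse_block_diff(command: str) -> BlockDiffRequest:
    """Parse 'BLOCK_DIFF <relation_oid> BLOCK_LIST m,n,...' (proposed syntax only)."""
    tokens = command.split(None, 3)
    if len(tokens) != 4 or tokens[0] != "BLOCK_DIFF" or tokens[2] != "BLOCK_LIST":
        raise ValueError(f"malformed command: {command!r}")
    # int() raises ValueError on non-numeric or empty fields.
    relation_oid = int(tokens[1])
    blocks = [int(b) for b in tokens[3].split(",")]
    if relation_oid < 0 or any(b < 0 for b in blocks):
        raise ValueError("OID and block numbers must be non-negative")
    # De-duplicate and sort so each block is requested once, in order.
    return BlockDiffRequest(relation_oid, sorted(set(blocks)))
```

The point is that the request names a relation and block numbers, never
a file path, so there is nothing for the server to resolve on the file
system beyond what it already knows how to map.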

Other clients that might find such an interface useful are incremental
pg_basebackup or possibly parallel pg_basebackup.

> Which is close to what pg_read_binary_file does now for a superuser.

I really don't see them as being all that close. Further, I worry a bit
that users would abuse the replication role, granting it to
non-superusers so that they can use these functions to read non-PG files
(ones which happen to be under PGDATA, or which are reachable from it
through a symlink).
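
The symlink case is easy to demonstrate outside of PostgreSQL. The
following is a minimal sketch, not how the backend checks paths; the
directory layout and the resolves_inside and demo names are made up for
illustration. It shows a relative path that looks like it lives under a
data directory but resolves to a file somewhere else.

```python
import os
import tempfile
from typing import Dict


def resolves_inside(base_dir: str, relative_path: str) -> bool:
    """True if relative_path, after following symlinks, stays under base_dir."""
    base = os.path.realpath(base_dir)
    target = os.path.realpath(os.path.join(base, relative_path))
    return os.path.commonpath([base, target]) == base


def demo() -> Dict[str, bool]:
    """Build a throwaway tree with a symlink that leaves the data directory."""
    with tempfile.TemporaryDirectory() as root:
        pgdata = os.path.join(root, "pgdata")    # stand-in for PGDATA
        outside = os.path.join(root, "outside")  # stand-in for any other directory
        os.mkdir(pgdata)
        os.mkdir(outside)
        with open(os.path.join(outside, "secret.txt"), "w") as f:
            f.write("not a PG file\n")
        # A symlink inside the data directory pointing at the other directory.
        os.symlink(outside, os.path.join(pgdata, "link"))
        paths = ["postgresql.conf", "link/secret.txt", "../outside/secret.txt"]
        return {p: resolves_inside(pgdata, p) for p in paths}
```

A check that only looks at the textual path would accept
link/secret.txt, since it contains no ".." and is relative to the data
directory.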

> We would need as well to extend BASE_BACKUP so as it does not include
> relation files though for this use case.

... huh? I'm not following this comment at all. We might need to
provide explicit start/stop backup commands and/or extend BASE_BACKUP
for things like parallel pg_basebackup, but I'm not following why we
would need to change it for pg_rewind. Further, BASE_BACKUP clearly does
include relation files today.

Thanks!

Stephen
