Re: fix tablespace handling in pg_combinebackup

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: fix tablespace handling in pg_combinebackup
Date: 2024-04-18 13:03:21
Message-ID: CA+Tgmob3agKG_VM8Ce6dJr1shiso3eC8rbWqkAA_Y0OXW629oA@mail.gmail.com
Lists: pgsql-hackers

On Wed, Apr 17, 2024 at 5:50 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > +If there are tablespace present in the backup, include tablespace_map as
> > +a keyword parameter whose values is a hash. When tar_program is used, the
> > +hash keys are tablespace OIDs; otherwise, they are the tablespace pathnames
> > +used in the backup. In either case, the values are the tablespace pathnames
> > +that should be used for the target cluster.
>
> Where would one get these oids?

You pretty much have to pick them out of the tar file names. It sucks,
but it's not this patch's fault. That's just how pg_basebackup works.
If you do a directory format backup, you can use -T to relocate
tablespaces on the fly, using the pathnames from the origin server.
That's a weird convention, and we probably should have based it on the
tablespace names and not exposed the server pathnames to the client at
all, but we didn't. But it's still better than what happens when you
do a tar-format backup. In that case you just get a bunch of $OID.tar
files. No trace of the server pathnames remains, and the only way you
could learn the tablespace names is if you rooted through whatever
file contains the contents of the pg_tablespace system catalog. So
you've just got a bunch of OID-named things and it's all up to you to
figure out which one is which and what to put in the tablespace_map
file. I'd call this terrible UI design, but I think it's closer to an
absence of UI design.
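
To make "pick them out of the tar file names" concrete, here is a
minimal sketch in Python. It is not part of the patch: the function
name tablespace_oids and the list of compression suffixes are
assumptions made for illustration. It only relies on the $OID.tar
naming described above, so files such as base.tar are skipped because
their names are not numeric.

```python
import re
from pathlib import Path

# Matches "<OID>.tar", optionally followed by a compression suffix.
# The suffix list is an assumption for illustration, not an exhaustive
# statement of what pg_basebackup can produce.
_TABLESPACE_TAR = re.compile(r"^(\d+)\.tar(?:\.(?:gz|lz4|zst))?$")


def tablespace_oids(backup_dir):
    """Return the sorted tablespace OIDs implied by tar file names.

    Only file names are inspected; the archives are never opened, so
    this says nothing about tablespace names or server pathnames.
    """
    oids = []
    for entry in Path(backup_dir).iterdir():
        match = _TABLESPACE_TAR.match(entry.name)
        if match and entry.is_file():
            oids.append(int(match.group(1)))
    return sorted(oids)
```

This gets you the OIDs and nothing more. Deciding which OID is which
tablespace, and what pathname each should map to, is still up to you.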

I wonder if we (as a project) would consider a patch that redesigned
this whole mechanism. Produce ${TABLESPACE_NAME}.tar in tar-format,
instead of ${OID}.tar. In directory-format, relocate via
-T${TABLESPACE_NAME}=${DIR} instead of -T${SERVERDIR}=${DIR}. That
would be a significant compatibility break, and you'd somehow need to
solve the problem of what to put in the tablespace_map file, which
requires OIDs. But it seems like if you could finesse that issue in
some elegant way, the result would just be a heck of a lot more usable
than what we have today.

> Could some of this be simplified by using allow_in_place_tablespaces instead?
> Looks like it'd simplify at least the extended test somewhat?

I don't think we can afford to assume that allow_in_place_tablespaces
doesn't change the behavior. I said (at least off-list) when that
feature was introduced that there was no way it was going to remain an
isolated development hack, and I think that's proved to be 100%
correct. We keep needing code to support it in more places, and I
expect that to continue. Probably we're going to need to start testing
everything both ways, which I think was a pretty predictable result of
introducing it in the first place.

--
Robert Haas
EDB: http://www.enterprisedb.com
