cross-platform pg_basebackup

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: cross-platform pg_basebackup
Date: 2022-10-20 15:11:17
Message-ID: CA+TgmoY+jC3YiskomvYKDPK3FbrmsDU7_8+wMHt02HOdJeRb0g@mail.gmail.com
Lists: pgsql-hackers

Suppose that, for some reason, you want to use pg_basebackup on a
Linux machine to back up a database cluster on a Windows machine.
Suppose further that you attempt to use the -T option. Then you might
run afoul of this check:

    /*
     * This check isn't absolutely necessary. But all tablespaces are created
     * with absolute directories, so specifying a non-absolute path here would
     * just never match, possibly confusing users. It's also good to be
     * consistent with the new_dir check.
     */
    if (!is_absolute_path(cell->old_dir))
        pg_fatal("old directory is not an absolute path in tablespace mapping: %s",
                 cell->old_dir);

The problem is that the definition of is_absolute_path() here differs
depending on whether you are on Windows or not. So this code is, I
think, subtly incorrect. What it is testing is whether the
user-specified pathname is an absolute pathname *on the local machine*
whereas what it should be testing is whether the user-specified
pathname is an absolute pathname *on the remote machine*. There's no
problem if both sides are Windows or neither side is Windows, but if
the remote side is and the local side isn't, then something like
-TC:\foo=/backup/foo will fail. As far as I know, there's no reason
why that shouldn't be permitted to work.

What this check is actually intending to prevent, I believe, is
something like -T../mytablespace=/bkp/ts1, because that wouldn't
actually work: the tablespace paths in the list sent by the server
are always absolute, so a relative old directory can never match. The
tablespace wouldn't get remapped, and the user might be confused about
why it didn't, so it is good that we tell them what they did wrong.
However, I think we could relax the check a little bit, something
along the lines of !is_nonwindows_absolute_path(cell->old_dir) &&
!is_windows_absolute_path(cell->old_dir). We can't actually know whether the
remote side is Windows or non-Windows, but if the string we're given
is plausibly an absolute path under either set of conventions, it's
probably fine to just search the list for it and see if it shows up.
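
To make the relaxed check concrete, here is a minimal standalone
sketch. The two helpers carry the names proposed above, but their
bodies are my assumption of what such functions might look like, not
existing PostgreSQL code; is_plausibly_absolute_path() is a
hypothetical wrapper added only for illustration. Treating a leading
separator as absolute under Windows conventions is likewise an
assumption of this sketch.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: absolute under Unix-style conventions. */
static bool
is_nonwindows_absolute_path(const char *path)
{
    assert(path != NULL);
    return path[0] == '/';
}

/*
 * Hypothetical helper: absolute under Windows conventions.  Accepts a
 * leading separator (which also covers UNC paths such as \\host\share)
 * or a drive letter followed by ':' and a separator.
 */
static bool
is_windows_absolute_path(const char *path)
{
    assert(path != NULL);
    if (path[0] == '/' || path[0] == '\\')
        return true;
    /* Short-circuit evaluation keeps us from reading past the NUL. */
    return isalpha((unsigned char) path[0]) &&
        path[1] == ':' &&
        (path[2] == '/' || path[2] == '\\');
}

/*
 * Hypothetical wrapper: true if the path is absolute under either set
 * of conventions, since the client cannot know the server's platform.
 */
static bool
is_plausibly_absolute_path(const char *path)
{
    return is_nonwindows_absolute_path(path) ||
        is_windows_absolute_path(path);
}
```

With this, the check would reject cell->old_dir only when
is_plausibly_absolute_path() returns false. Both C:\foo and
/backup/foo would be accepted as old directories on any client
platform, while ../mytablespace would still produce the error.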

This would have the disadvantage that if a Linux user creates a
tablespace directory inside $PGDATA and gives it a name like
/home/rhaas/pgdata/C:\Program Files\PostgreSQL\Data, and then attempts
a backup with '-TC:\Program Files\PostgreSQL\Data=/tmp/ts1' it will
not relocate the tablespace, yet the user won't get a message
explaining why. I'm prepared to dismiss that scenario as "not a real
use case".

Thoughts?

--
Robert Haas
EDB: http://www.enterprisedb.com
