Re: [PATCH] Relocation of tablespaces in pg_basebackup

From: Steeve Lennmark <steevel(at)handeldsbanken(dot)se>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Andreas Karlsson <andreas(at)proxel(dot)se>, Magnus Hagander <magnus(at)hagander(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Relocation of tablespaces in pg_basebackup
Date: 2014-01-23 10:01:55
Message-ID: CADAK8w42stKsNtU_PFMUqLZQdbgF6UtQ1z6_CRQWObR76R2P=A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Peter,
On Thu, Jan 23, 2014 at 2:06 AM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
>
> I'm tempted to think it should be mandatory to specify this option in
> plain mode when tablespaces are present. Otherwise, creating a base
> backup is liable to create random files all over the place. Obviously,
> there would be backward compatibility concerns.
>

That was my initial thought too, the one thing that speaks FOR a change
in behaviour is that there isn't a lot of people who have moved over to
pg_basebackup yet and even fewer who use multiple tablespaces. For me
at least pg_basebackup isn't an option at the moment since I use more
than one tablespace.

I'm not totally happy with the choice of ":" as the mapping separator,
> because that would always require escaping on Windows. We could make it
> analogous to the path handling and make ";" the separator on Windows.
> Then again, this is not a path, so maybe it should look like one. We
> pick something else altogether, like "=".
>
> The option name "--tablespace" isn't very clear. It ought to be
> something like "--tablespace-mapping".
>

This design choice I made about -T (--tablespace) and colon as
separator was copied from pg_barman, which is the way I back up my
clusters at the moment. Renaming the option to --tablespace-mapping and
changing the syntax to something like "=" is totally fine by me.

> I don't think we should require the new directory to be an absolute
> path. It should be relative to the current directory, just like the -D
> option does it.
>

Accepting a relative path should be fine, I made a failed attempt using
realpath(3) initially but I guess checking for [0] != '/' and
prepending getcwd(3) would suffice.

I would try to write this patch without using MAXPGPATH. I know
> existing code uses it, but we should try to use it less, because it
> overallocates memory and requires handling additional error conditions.
> For example, you catch overflow in tablespace_list_append() but later
> report that as invalid tablespace format. We now have psprintf() to
> make coding with dynamic memory allocation easier.
>

Is overallocating memory in a cli application really an issue though? I
will obviously rewrite the code to fix that if necessary.

It looks like when you ignore an escaped ":" as a separator, you don't
> actually unescape it for use as a directory.
>

+ if (*arg == '\\' && *(arg + 1) == ':')
+ ;

This code handles that case, I could try to make that cleaner.

> Somehow, I had the subconscious assumption that this feature would do
> prefix matching on the directories, not only complete string matching.
> So if I had tablespaces in /mnt/data1, /mnt/data2, /mnt/data3, I could
> map them all with -T /mnt:mnt and be done. Not sure if that is useful
> to many, but it's worth a thought.
>

I like that a lot, but I'm afraid the code would have to get a bit more
complex to add that functionality. It would be an easier rewrite if we
added a hint character, something like -T '/mnt/*:mnt'.

> Review style guide for error messages:
> http://www.postgresql.org/docs/devel/static/error-style-guide.html

I will do that.

We need to think about how to handle this on platforms without symlinks.
> I don't like just printing an error message and moving on. It should be
> either pass or fail or an option to choose between them.
>

I hope someone with experience with those kind of systems can come up
with suggestions on how to solve that. I only run postgres on Linux.

>
> Please put the options in the getopt handling in alphabetical order.
> It's not very alphabetical now, but between D and F is probably not the
> best place. ;-)
>

Done.

//Steeve

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dean Rasheed 2014-01-23 10:06:25 Re: WIP patch (v2) for updatable security barrier views
Previous Message Christian Kruse 2014-01-23 09:15:11 Re: Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire