Re: TODO item: WAL replay of CREATE TABLESPACE with differing directory structure

From: Patrick Krecker <pkrecker(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: TODO item: WAL replay of CREATE TABLESPACE with differing directory structure
Date: 2018-02-19 00:43:38
Message-ID: CACh_hd65-hn1DNqsOV-iqGvCgzZHfYY=fxo9bdLX2yQ-SPg55Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Feb 13, 2018 at 8:24 PM, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> On Tue, Feb 13, 2018 at 01:44:34PM -0800, Patrick Krecker wrote:
>> I am searching for a way to make a contribution to Postgres and I came
>> across this TODO item (I realize there has been some controversy
>> around the TODO list [1], and I hope that my use of it doesn't spark
>> another discussion about removing it altogether):
>
> Well, it will point out again that TODO items are hard, complicated and
> mostly impossible projects.
>
>> "Allow WAL replay of CREATE TABLESPACE to work when the directory
>> structure on the recovery computer is different from the original"
>>
>> Currently it looks like tablespaces have to live inside the data
>> directory on the replica, notwithstanding administrator intervention
>> by manipulating the tablespace directory with symlinks after (or even
>> before?) it has been created via replay.
>
> Let's be clear here. There is no hard restriction with tablespace paths
> within the data directory, though you should not do that, and you get a
> nice warning when trying to do so with CREATE TABLESPACE (see 33cb8ff6).
> This also causes pg_basebackup to fail. It is also bad design to create
> tablespaces within the data directory as those are aimed at making hot
> paths work on different partitions with different I/O properties.

Sorry, my language was imprecise here. What I meant is that the
pg_tablespace directory contains no symlinks when a tablespace
creation is streamed to a replica, i.e. the data files reside within
pg_tablespace on the replica.

>> Is the idea behind this task to allow the master to instruct the
>> replica where to put the tablespace on its filesystem, so as to allow
>> it to live outside of the data directory without direct manipulation
>> of the filesystem?
>
> WAL records associated to CREATE TABLESPACE (xl_tblspc_create_rec)
> register the location where a tablespace is located. The location of a
> tablespace is not saved in the system catalogs, which offers flexibility
> in the way the symlink from pg_tblspc can be handled. This is where the
> tablespace path remapping of pg_basebackup becomes handy, because you
> can repurpose paths easily when taking a base backup, but this forces
> you to create tablespaces first, and then create standbys. We have also
> a set of existing problems:
> 1) If a primary and its standby are on the same server and you issue a
> CREATE TABLESPACE, then they would try to write to the same paths.
> 2) How do we design at DDL level a command which allows for specifying
> different paths depending on the node where the recovery happens.
>
> You would need in both cases a sort of ability to define a node name, so
> as for 1) you append the node name to the path and both primary and
> standby can use the same tablespace path, but with different sub-paths.
> And for 2), you can enforce a patch name by defining as well a path
> associated to a node name so as when xl_tblspc_create_rec records are
> replayed at recovery, you know which path to create. Just designing
> that the right way as its own set of complications.
>
>> If this task is a worthwhile endeavor, I would be happy to take it on.
>> If not, I am open to other ideas :)
>
> This is part of the difficult, perhaps-not-worth doing impossible
> problems. As a first contribution, you may want something easier.

Thank you for the response. I would suggest that we link to it from
the wiki so as to provide clarification to future readers of the todo
list.

> --
> Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2018-02-19 00:48:40 Re: ALTER TABLE ADD COLUMN fast default
Previous Message Michael Paquier 2018-02-19 00:39:50 Re: [HACKERS] Cache lookup errors with functions manipulation object addresses