|From:||Michael Paquier <michael(at)paquier(dot)xyz>|
|To:||Patrick Krecker <pkrecker(at)gmail(dot)com>|
|Cc:||PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>|
|Subject:||Re: TODO item: WAL replay of CREATE TABLESPACE with differing directory structure|
On Tue, Feb 13, 2018 at 01:44:34PM -0800, Patrick Krecker wrote:
> I am searching for a way to make a contribution to Postgres and I came
> across this TODO item (I realize there has been some controversy
> around the TODO list, and I hope that my use of it doesn't spark
> another discussion about removing it altogether):
Well, it is worth pointing out again that TODO items tend to be hard,
complicated and mostly impossible projects.
> "Allow WAL replay of CREATE TABLESPACE to work when the directory
> structure on the recovery computer is different from the original"
> Currently it looks like tablespaces have to live inside the data
> directory on the replica, notwithstanding administrator intervention
> by manipulating the tablespace directory with symlinks after (or even
> before?) it has been created via replay.
Let's be clear here. There is no hard restriction on tablespace paths
within the data directory, though you should not do that, and you get a
nice warning when trying to do so with CREATE TABLESPACE (see 33cb8ff6).
This also causes pg_basebackup to fail. It is also bad design to create
tablespaces within the data directory as those are aimed at making hot
paths work on different partitions with different I/O properties.
> Is the idea behind this task to allow the master to instruct the
> replica where to put the tablespace on its filesystem, so as to allow
> it to live outside of the data directory without direct manipulation
> of the filesystem?
WAL records associated with CREATE TABLESPACE (xl_tblspc_create_rec)
record the location where a tablespace is located. The location of a
tablespace is not saved in the system catalogs, which offers flexibility
in the way the symlink from pg_tblspc can be handled. This is where the
tablespace path remapping of pg_basebackup becomes handy, because you
can repurpose paths easily when taking a base backup, but this forces
you to create tablespaces first, and then create standbys. We also have
a set of existing problems:
1) If a primary and its standby are on the same server and you issue a
CREATE TABLESPACE, then they would try to write to the same paths.
2) How do we design, at the DDL level, a command which allows for
specifying different paths depending on the node where the recovery
happens?
You would need in both cases some ability to define a node name. For
1), you append the node name to the path, so that both primary and
standby can use the same tablespace path, but with different sub-paths.
For 2), you can enforce a path name by also defining a path associated
with a node name, so that when xl_tblspc_create_rec records are replayed
at recovery, you know which path to create. Just designing that the
right way has its own set of complications.
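
To make the two cases concrete, here is a minimal sketch in Python of how
a node could pick its directory when replaying such a record. Nothing like
this exists in PostgreSQL today: the function name, the node_name argument
and the node_paths mapping are all hypothetical, and the paths are made up
for illustration.

```python
import posixpath


def resolve_tablespace_path(logged_path, node_name, node_paths=None):
    """Return the directory a node would use for a replayed tablespace.

    logged_path: the location carried by the WAL record.
    node_name:   hypothetical per-node identifier (no such setting exists).
    node_paths:  hypothetical mapping of node name -> explicit path.
    """
    node_paths = node_paths or {}

    # Case 2): a path was explicitly associated with this node name,
    # so it overrides the location carried by the WAL record.
    explicit = node_paths.get(node_name)
    if explicit is not None:
        return explicit

    # Case 1): keep the logged base path but append the node name, so a
    # primary and a standby on the same server use different sub-paths.
    return posixpath.join(logged_path, node_name)
```

With this, a primary and a standby on the same host that both see
/mnt/fast/ts1 in the record would end up in /mnt/fast/ts1/primary and
/mnt/fast/ts1/standby1, while a node with an explicit entry in node_paths
would use that entry instead. The hard part, how the mapping gets declared
at the DDL level and shipped to each node, is left out entirely.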
> If this task is a worthwhile endeavor, I would be happy to take it on.
> If not, I am open to other ideas :)
This is one of those difficult, perhaps not worth doing, nearly
impossible problems. As a first contribution, you may want something
easier.