Re: Bug with pg_basebackup and 'shared' tablespace

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: pierre(dot)ducroquet(at)people-doc(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org, michael(dot)paquier(at)gmail(dot)com, daniel(at)yesql(dot)se, magnus(at)hagander(dot)net
Subject: Re: Bug with pg_basebackup and 'shared' tablespace
Date: 2017-09-29 05:19:17
Message-ID: 20170929.141917.20719591.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

At Tue, 19 Sep 2017 17:07:10 +0200, Pierre Ducroquet <pierre(dot)ducroquet(at)people-doc(dot)com> wrote in <1678633(dot)OBaBNztJnJ(at)pierred-pdoc>
> On Tuesday, September 19, 2017 12:52:37 PM CEST you wrote:
> > On Thu, Sep 14, 2017 at 2:17 AM, Pierre Ducroquet <p(dot)psql(at)pinaraf(dot)info>
> wrote:
> > > All my apologies for the schockingly long time with no answer on this
> > > topic.
> > No problem. That's the concept called life I suppose.
> >
> > > I will do my best to help review some patches in the current CF.
> >
> > Thanks for the potential help!
> >
> > > Attached to this email is the new version of the patch, checked against
> > > HEAD and REL_10_STABLE, with the small change required by Michael (affect
> > > true/ false to the boolean instead of 1/0) applied.
> >
> > Thanks for the new version.
> >
> > So I have been pondering about this patch, and allowing multiple
> > versions of Postgres to run in the same tablespace is mainly here for
> > upgrade purposes, so I don't think that we should encourage such
> > practices for performance reasons. There is as well
> > --tablespace-mapping which is here to cover a similar need and we know
> > that this solution works, at least people have spent time to make sure
> > it does.
> >
> > Another problem that I have with this patch is that on HEAD,
> > permission checks are done before receiving any data. I think that
> > this is really saner to do any checks like that first, so as you can
> > know if the tablespace is available for write access before writing
> > any data to it. With this patch, you may finish by writing a bunch of
> > data to a tablespace path, but then fail on another tablespace because
> > of permission issues so the backup becomes useless after transferring
> > and writing potentially hundreds of gigabytes. This is no fun to
> > detect that potentially too late, and can impact the responsiveness of
> > instances to get backups and restore from them.
> >
> > All in all, this patch gets a -1 from me in its current shape. If
> > Horiguchi-san or yourself want to process further with this patch, of
> > course feel free to do so, but I don't feel that we are actually
> > trying to solve a new problem here. I am switching the patch to
> > "waiting on author".

Hmm. I recall that my opinion on the "problem" was that we should
just prohibit sharing rather than allow it. However, if there's
who wants it and the behavior is not so bizarre, I don't reject
the direction.

As long as I saw the patch, it just delays digging of the top
directory until the CATVER directory becomes knwon without
additional checks. The behavior is identical to what the current
version does on tablespace directories so the pointed problems
don't seem to be new ones caused by this patch and such situation
seems a bit malicious.

> Hi
>
> The point of this patch has never been to 'promote' sharing tablespaces
> between PostgreSQL versions. This is not a documented feature, and it would be
> imho a bad idea to promote it because of bugs like this one.

That *was* a bug, but could be a not-a-bug after we define the
behavior reasonably, and may be should be documented.

> But that bug is a problem for people who are migrating between postgresql
> releases one database at a time on the same server (maybe not a best practice,
> but a real use case nevertheless). That's how I met this bug, and that's why I
> wanted to patch it. I know there is a workaround, but to me it seems counter-
> intuitive that with no warning I can create a postgresql cluster that can not
> be restored without additional pg_basebackup settings.
> If it is indeed forbidden to share a tablespace between postgresql releases,
> why don't we enforce it ? Or at least show a warning when CREATE TABLESPACE
> encounter a non-empty folder ?
>
> Regarding the permission check, I don't really see how this is a real problem
> with the patch. I have to check on master, but it seems to me that the
> permission check can still be done before starting writing data on disc. We
> could just delay the 'empty' folder check, and keep checking the folder
> permissions.
>
> And I will do the pgindent run, thanks for the nitpick :)

So I'll wait for the new version, but I have a comment on the
patch.

It would practically work but I don't like the fact that the
patch relies on the specific directory/file ordering in the tar
stream. This is not about the CATVER directory but lower
directories. Being more strict, it actually makes excessive calls
to verify_dir_is_em..() for more lower directories contrarily to
expectations.

I think that we can take more more robust way to detect the
CATVER directory. Just checking if it is a top-level directory
will work. That is, making the following change.

- if (firstfile && !basetablespace)
+ /* copybuf must contain at least one '/' here */
+ if (!basetablespace && strchr(copybuf, '/')[1] == 0)

This condition exactly hits only CATVER directories not being
disturbed by file ordering of the tar stream.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavan Deolasee 2017-09-29 05:36:38 Re: pgbench stuck with 100% cpu usage
Previous Message Vaishnavi Prabakaran 2017-09-29 04:44:33 Re: Refactor handling of database attributes between pg_dump and pg_dumpall