Re: standby recovery fails (tablespace related) (tentative patch and discussion)

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: pguo(at)pivotal(dot)io
Cc: apraveen(at)pivotal(dot)io, pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby recovery fails (tablespace related) (tentative patch and discussion)
Date: 2019-04-22 07:15:13
Message-ID: 20190422.161513.258021727.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello.

At Mon, 22 Apr 2019 12:36:43 +0800, Paul Guo <pguo(at)pivotal(dot)io> wrote in <CAEET0ZGpUrMGUzfyzVF9FuSq+zb=QovYa2cvyRnDOTvZ5vXxTw(at)mail(dot)gmail(dot)com>
> Please see my replies inline. Thanks.
>
> On Fri, Apr 19, 2019 at 12:38 PM Asim R P <apraveen(at)pivotal(dot)io> wrote:
>
> > On Wed, Apr 17, 2019 at 1:27 PM Paul Guo <pguo(at)pivotal(dot)io> wrote:
> > >
> > > create db with tablespace
> > > drop database
> > > drop tablespace.
> >
> > Essentially, that sequence of operations causes crash recovery to fail
> > if the "drop tablespace" transaction was committed before crashing.
> > This is a bug in crash recovery in general and should be reproducible
> > without configuring a standby. Is that right?
> >
>
> No. In general, checkpoint is done for drop_db/create_db/drop_tablespace on
> master.
> That makes the file/directory update-to-date if I understand the related
> code correctly.
> For standby, checkpoint redo does not ensure that.

That's right partly. As you must have seen, fast shutdown forces
restartpoint for the last checkpoint and it prevents the problem
from happening. Anyway it seems to be a problem.

> > Your patch creates missing directories in the destination. Don't we
> > need to create the tablespace symlink under pg_tblspc/? I would
> >
>
> 'create db with tablespace' redo log does not include the tablespace real
> directory information.
> Yes, we could add in it into the xlog, but that seems to be an overdesign.

But I don't think creating directory that is to be removed just
after is a wanted solution. The directory most likely to be be
removed just after.

> > prefer extending the invalid page mechanism to deal with this, as
> > suggested by Ashwin off-list. It will allow us to avoid creating
>
> directories and files only to remove them shortly afterwards when the
> > drop database and drop tablespace records are replayed.
> >
> >
> 'invalid page' mechanism seems to be more proper for missing pages of a
> file. For
> missing directories, we could, of course, hack to use that (e.g. reading
> any page of
> a relfile in that database) to make sure the tablespace create code
> (without symlink)
> safer (It assumes those directories will be deleted soon).
>
> More feedback about all of the previous discussed solutions is welcome.

It doesn't seem to me that the invalid page mechanism is
applicable in straightforward way, because it doesn't consider
simple file copy.

Drop failure is ignored any time. I suppose we can ignore the
error to continue recovering as far as recovery have not reached
consistency. The attached would work *at least* your case, but I
haven't checked this covers all places where need the same
treatment.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
ignore_dir_create_error_before_consistency.patch text/x-patch 2.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Konstantin Knizhnik 2019-04-22 07:38:18 Re: block-level incremental backup
Previous Message Fabien COELHO 2019-04-22 07:04:08 Re: [PATCH v1] Add \echo_stderr to psql