Re: standby recovery fails (tablespace related) (tentative patch and discussion)

From: Asim R P <apraveen(at)pivotal(dot)io>
To: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>
Cc: Paul Guo <pguo(at)pivotal(dot)io>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: standby recovery fails (tablespace related) (tentative patch and discussion)
Date: 2019-09-10 11:42:55
Message-ID: CANXE4TdfLYEQmjwQdMuh9D3crdcyLUhsy056H6_96=BAQA2nxQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Anastasia

On Thu, Aug 22, 2019 at 9:43 PM Anastasia Lubennikova <
a(dot)lubennikova(at)postgrespro(dot)ru> wrote:
>
> But during the review, I found a bug in the current implementation.
> New behavior must apply to crash-recovery only, now it applies to
archiveRecovery too.
> That can cause a silent loss of a tablespace during regular standby
operation
> since it never calls CheckRecoveryConsistency().
>
> Steps to reproduce:
> 1) run master and replica
> 2) create dir for tablespace:
> mkdir /tmp/tblspc1
>
> 3) create tablespace and database on the master:
> create tablespace tblspc1 location '/tmp/tblspc1';
> create database db1 tablespace tblspc1 ;
>
> 4) wait for replica to receive this changes and pause replication:
> select pg_wal_replay_pause();
>
> 5) move replica's tablespace symlink to some empty directory, i.e.
/tmp/tblspc2
> mkdir /tmp/tblspc2
> ln -sfn /tmp/tblspc2 postgresql_data_replica/pg_tblspc/16384
>

By changing the tablespace symlink target, we are silently nullifying
effects of a committed transaction from the standby data directory - the
directory structure created by the standby for create tablespace
transaction. This step, therefore, does not look like a valid test case to
me. Can you share a sequence of steps that does not involve changing data
directory manually?

>
> Also some nitpicking about code style:
> 1) Please, add comment to forget_missing_directory().
>
> 2) + elog(LOG, "Directory \"%s\" was missing during
directory copying "
> I think we'd better update this message elevel to WARNING.
>
> 3) Shouldn't we also move FlushDatabaseBuffers(xlrec->src_db_id); call
under
> if (do_copydir) clause?
> I don't see a reason to flush pages that we won't use later.
>

Thank you for the review feedback. I agree with all the points. Let me
incorporate them (I plan to pick this work up and drive it to completion as
Paul got busy with other things).

But before that I'm revisiting another solution upthread, that of creating
restart points when replaying create/drop database commands before making
filesystem changes such as removing a directory. The restart points should
align with checkpoints on master. The concern against this solution was
creation of restart points will slow down recovery. I don't think crash
recovery is affected by this solution because of the already existing
enforcement of checkpoints. WAL records prior to a create/drop database
will not be seen by crash recovery due to the checkpoint enforced during
the command's normal execution.

Asim

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Noah Misch 2019-09-10 11:45:17 Re: [HACKERS] WAL logging problem in 9.4.3?
Previous Message Amit Kapila 2019-09-10 11:19:46 Re: Berserk Autovacuum (let's save next Mandrill)