From: | Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru> |
---|---|
To: | Asim R P <apraveen(at)pivotal(dot)io> |
Cc: | Paul Guo <pguo(at)pivotal(dot)io>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: standby recovery fails (tablespace related) (tentative patch and discussion) |
Date: | 2019-09-11 14:26:44 |
Message-ID: | a82a896b-93f0-c26c-b941-f5665131381b@postgrespro.ru |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
10.09.2019 14:42, Asim R P wrote:
> Hi Anastasia
>
> On Thu, Aug 22, 2019 at 9:43 PM Anastasia Lubennikova
> <a(dot)lubennikova(at)postgrespro(dot)ru <mailto:a(dot)lubennikova(at)postgrespro(dot)ru>>
> wrote:
> >
> > But during the review, I found a bug in the current implementation.
> > New behavior must apply to crash-recovery only, now it applies to
> archiveRecovery too.
> > That can cause a silent loss of a tablespace during regular standby
> operation
> > since it never calls CheckRecoveryConsistency().
> >
> > Steps to reproduce:
> > 1) run master and replica
> > 2) create dir for tablespace:
> > mkdir /tmp/tblspc1
> >
> > 3) create tablespace and database on the master:
> > create tablespace tblspc1 location '/tmp/tblspc1';
> > create database db1 tablespace tblspc1 ;
> >
> > 4) wait for replica to receive this changes and pause replication:
> > select pg_wal_replay_pause();
> >
> > 5) move replica's tablespace symlink to some empty directory, i.e.
> /tmp/tblspc2
> > mkdir /tmp/tblspc2
> > ln -sfn /tmp/tblspc2 postgresql_data_replica/pg_tblspc/16384
> >
>
> By changing the tablespace symlink target, we are silently nullifying
> effects of a committed transaction from the standby data directory -
> the directory structure created by the standby for create tablespace
> transaction. This step, therefore, does not look like a valid test
> case to me. Can you share a sequence of steps that does not involve
> changing data directory manually?
>
Hi, the whole idea of the test is to reproduce a data loss. For example,
if the disk containing this tablespace failed.
Probably, simply deleting the directory
'postgresql_data_replica/pg_tblspc/16384'
would work as well, though I was afraid that it can be caught by some
earlier checks and my example won't be so illustrative.
>
> Thank you for the review feedback. I agree with all the points. Let
> me incorporate them (I plan to pick this work up and drive it to
> completion as Paul got busy with other things).
>
> But before that I'm revisiting another solution upthread, that of
> creating restart points when replaying create/drop database commands
> before making filesystem changes such as removing a directory. The
> restart points should align with checkpoints on master. The concern
> against this solution was creation of restart points will slow down
> recovery. I don't think crash recovery is affected by this solution
> because of the already existing enforcement of checkpoints. WAL
> records prior to a create/drop database will not be seen by crash
> recovery due to the checkpoint enforced during the command's normal
> execution.
>
I haven't measured the impact of generating extra restart points in
previous solution, so I cannot tell whether concerns upthread are
justified. Still, I enjoy latest design more, since it is clear and
similar with the code of checking unexpected uninitialized pages. In
principle it works. And the issue, I described in previous review can be
easily fixed by several additional checks of InHotStandby macro.
--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
From | Date | Subject | |
---|---|---|---|
Next Message | Daniel Verite | 2019-09-11 14:53:16 | Create collation reporting the ICU locale display name |
Previous Message | Fabien COELHO | 2019-09-11 14:25:07 | Re: [patch]socket_timeout in interfaces/libpq |