Re: Corruption during WAL replay

From: Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>
To: tejeswarm(at)hotmail(dot)com, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, hlinnaka(at)iki(dot)fi, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, hexexpert(at)comcast(dot)net
Subject: Re: Corruption during WAL replay
Date: 2021-03-04 17:37:23
Message-ID: CALtqXTewTHP=SJYf4X=Xk2dMf5GQmF34BTQ7a+s7+3yEt-MFpg@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jan 6, 2021 at 1:33 PM Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
wrote:

> At Mon, 17 Aug 2020 11:22:15 -0700, Andres Freund <andres(at)anarazel(dot)de>
> wrote in
> > Hi,
> >
> > On 2020-08-17 14:05:37 +0300, Heikki Linnakangas wrote:
> > > On 14/04/2020 22:04, Teja Mupparti wrote:
> > > > Thanks Kyotaro and Masahiko for the feedback. I think there is a
> > > > consensus on the critical-section around truncate,
> > >
> > > +1
> >
> > I'm inclined to think that we should do that independent of the far more
> > complicated fix for other related issues.
> ...
> > > Perhaps a better approach would be to prevent the checkpoint from
> > > completing, until all in-progress truncations have completed. We have a
> > > mechanism to wait out in-progress commits at the beginning of a
> > > checkpoint, right after the redo point has been established. See
> > > comments around the GetVirtualXIDsDelayingChkpt() function call in
> > > CreateCheckPoint(). We could have a similar mechanism to wait out the
> > > truncations before *completing* a checkpoint.
> >
> > What I outlined earlier *is* essentially a way to do so, by preventing
> > checkpointing from finishing the buffer scan while a dangerous state
> > exists.
>
> Seems reasonable. The attached does that. It actually works for the
> initial case.
>
> regards.
>
> --
> Kyotaro Horiguchi
> NTT Open Source Software Center
>
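
For readers following the quoted proposal, here is a minimal, single-process
sketch of the idea of holding back checkpoint completion while a truncation is
in flight. It is not taken from the attached patch: every name below (the
toy_* functions, the flag array, TOY_MAX_BACKENDS) is hypothetical, and real
code would keep this state in shared memory under proper locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_BACKENDS 8   /* hypothetical fixed backend count */

/*
 * Hypothetical per-backend flag: true while a backend is inside the
 * window between logging a truncation and finishing it on disk.
 */
static bool toy_delay_chkpt_end[TOY_MAX_BACKENDS];

/* A backend announces that it is entering the dangerous window. */
void toy_truncate_begin(int backend)
{
    assert(backend >= 0 && backend < TOY_MAX_BACKENDS);
    toy_delay_chkpt_end[backend] = true;
}

/* The backend leaves the window once the truncation is done. */
void toy_truncate_end(int backend)
{
    assert(backend >= 0 && backend < TOY_MAX_BACKENDS);
    toy_delay_chkpt_end[backend] = false;
}

/* Collect the backends that currently delay checkpoint completion. */
size_t toy_backends_delaying_chkpt_end(int *out, size_t out_len)
{
    size_t n = 0;

    for (int i = 0; i < TOY_MAX_BACKENDS && n < out_len; i++)
    {
        if (toy_delay_chkpt_end[i])
            out[n++] = i;
    }
    return n;
}

/* The checkpointer may finish only when no backend is in the window. */
bool toy_checkpoint_can_complete(void)
{
    int ids[TOY_MAX_BACKENDS];

    return toy_backends_delaying_chkpt_end(ids, TOY_MAX_BACKENDS) == 0;
}
```

In this toy model the checkpointer would call toy_checkpoint_can_complete()
before finishing and, if it returns false, sleep briefly and check again,
much as the existing wait at the start of a checkpoint is described above.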

The regression tests are failing with this patch. Would you mind looking into
that and sending an updated patch?

https://api.cirrus-ci.com/v1/task/6313174510075904/logs/test.log

...
t/006_logical_decoding.pl ............ ok
t/007_sync_rep.pl .................... ok
Bailout called. Further testing stopped: system pg_ctl failed
FAILED--Further testing stopped: system pg_ctl failed
make[2]: *** [Makefile:19: check] Error 255
make[1]: *** [Makefile:49: check-recovery-recurse] Error 2
make: *** [GNUmakefile:71: check-world-src/test-recurse] Error 2
...

--
Ibrar Ahmed
