Re: Corruption during WAL replay

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, Daniel Gustafsson <daniel(at)yesql(dot)se>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, deniel1495(at)mail(dot)ru, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, tejeswarm(at)hotmail(dot)com, hlinnaka <hlinnaka(at)iki(dot)fi>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Daniel Wood <hexexpert(at)comcast(dot)net>
Subject: Re: Corruption during WAL replay
Date: 2022-03-29 16:34:16
Message-ID: 20220329163416.GO10577@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Greetings,

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> On Fri, Mar 25, 2022 at 10:34 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > I dunno. Compatibility and speed concerns aside, that seems like an awful
> > lot of bits to be expending on every page compared to the value.
>
> I dunno either, but over on the TDE thread people seemed quite willing
> to expend like 16-32 *bytes* for page verifiers and nonces and things.

Absolutely.

> For compatibility and speed reasons, I doubt we could ever get by with
> doing that in every cluster, but I do have some hope of introducing
> something like that someday at least as an optional feature. It's not
> like a 16-bit checksum was state-of-the-art even when we introduced
> it. We just did it because we had 2 bytes that we could repurpose
> relatively painlessly, and not any larger number. And that's still the
> case today, so at least in the short term we will have to choose some
> other solution to this problem.

I agree that this would be great as an optional feature. Those patches
to allow the system to be built with reserved space for $whatever would
allow us to have a larger checksum for those who want it and perhaps a
nonce with TDE for those who wish that in the future. I mentioned
before that I thought it might be a good way to introduce page-level
epochs for 64bit xids too though it never seemed to get much traction.

Anyhow, this whole thread has struck me as a good reason to polish those
patches off and add on top of them an extended checksum ability, first,
independent of TDE, and remove the dependency of those patches from the
TDE effort and instead allow it to just leverage that ability. I still
suspect we'll have some folks who will want TDE w/o a per-page nonce and
that could be an option but we'd be able to support TDE w/ integrity
pretty easily, which would be fantastic.

Thanks,

Stephen

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michail Nikolaev 2022-03-29 16:51:18 Re: [PATCH] Full support for index LP_DEAD hint bits on standby
Previous Message Robert Haas 2022-03-29 16:22:21 Re: make MaxBackends available in _PG_init