Re: better page-level checksums

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: better page-level checksums
Date: 2022-06-14 15:47:49
Message-ID: CA+TgmoZK10Ck0GqgrQMhK376oRUgHyc6-7DDqJ-mXVCfsdnW1g@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jun 13, 2022 at 6:26 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> Anyway, I can see how it would be useful to be able to know the offset
> of a nonce or of a hash digest on any given page, without access to a
> running server. But why shouldn't that be possible with other designs,
> including designs closer to what I've outlined?

I don't know what you mean by this. As far as I'm aware, the only
design you've outlined is one where the space wasn't at the same
offset on every page.

> A known fixed offset in the special area already assumes that all
> pages must have a value in the first place, even though that won't be
> true for the majority of individual Postgres servers. There is
> implicit information involved in a design like the one Robert has
> proposed; your backup tool (or whatever) already has to understand to
> expect something other than no encryption at all, or no checksum at
> all. Tools like pg_filedump already rely on implicit information about
> the special area.

In general, I was imagining that you'd need to look at the control
file to understand how much space had been reserved per page in this
particular cluster. I agree that's a bit awkward, especially for
pg_filedump. However, pg_filedump, and I think also some code internal
to PostgreSQL, try to figure out what kind of page we've got by looking
at the *size* of the special space. It's only good luck that we
haven't had a collision there yet, and continuing to rely on that
seems like a dead end. Perhaps we should start including a per-AM
magic number at the beginning of the special space.
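
To make the special-space point concrete, here is a minimal sketch in
Python. It is not PostgreSQL or pg_filedump code. It assumes the default
8192-byte block size, a little-endian page image, and the standard
24-byte page header with pd_special at byte offset 16. The access method
names ("am_a", "am_b"), the magic values, and the helper names are all
made up for illustration; no such magic number exists today.

```python
import struct

BLCKSZ = 8192  # assumption: default block size

# Page header fields: pd_lsn (two uint32), then pd_checksum, pd_flags,
# pd_lower, pd_upper, pd_special, pd_pagesize_version (uint16 each),
# then pd_prune_xid (uint32). Little-endian is an assumption.
PAGE_HEADER = struct.Struct("<IIHHHHHHI")

# Hypothetical per-AM magic numbers; these values are invented.
MAGIC_TO_AM = {0xA11CE001: "am_a", 0xA11CE002: "am_b"}


def special_offset(page: bytes) -> int:
    """Return pd_special, the offset where the special space begins."""
    return PAGE_HEADER.unpack_from(page, 0)[6]


def special_size(page: bytes) -> int:
    """Size of the special space, which is all a size-based guess sees."""
    return len(page) - special_offset(page)


def am_from_magic(page: bytes):
    """Identify the AM from a (hypothetical) magic number that would sit
    at the beginning of the special space. Returns None if unknown."""
    (magic,) = struct.unpack_from("<I", page, special_offset(page))
    return MAGIC_TO_AM.get(magic)


def make_page(special: bytes) -> bytes:
    """Build an otherwise empty page image ending in the given special space."""
    pd_special = BLCKSZ - len(special)
    header = PAGE_HEADER.pack(
        0, 0,                # pd_lsn
        0, 0,                # pd_checksum, pd_flags
        PAGE_HEADER.size,    # pd_lower: no line pointers
        pd_special,          # pd_upper: no tuples
        pd_special,
        BLCKSZ | 4,          # pd_pagesize_version (layout version 4)
        0,                   # pd_prune_xid
    )
    return header + bytes(pd_special - PAGE_HEADER.size) + special


# Two different AMs whose special space happens to be the same size.
page_a = make_page(struct.pack("<I", 0xA11CE001) + bytes(12))
page_b = make_page(struct.pack("<I", 0xA11CE002) + bytes(12))

# Size alone cannot tell them apart; the magic number can.
assert special_size(page_a) == special_size(page_b) == 16
assert am_from_magic(page_a) != am_from_magic(page_b)
```

The sketch only shows why a size-based guess is ambiguous once two page
types reserve the same number of bytes, and how a leading magic number
would remove the ambiguity without consulting anything outside the page.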

> I'm not against the idea of picking a handful of checksum/encryption
> schemes, with the understanding that we'll be committing to those
> particular schemes indefinitely -- it's not reasonable to expect
> infinite flexibility here (and so I don't). But why should we accept
> something that seems to me to be totally inflexible, and doesn't
> compose with other things?

We shouldn't accept something that's totally inflexible, but I don't
know why this seems that way to you.

--
Robert Haas
EDB: http://www.enterprisedb.com
