Re: storing an explicit nonce

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Stephen Frost <sfrost(at)snowman(dot)net>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Tom Kincaid <tomjohnkincaid(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
Subject: Re: storing an explicit nonce
Date: 2021-05-25 18:25:21
Message-ID: CA+TgmobFO+cqE8anFVxwMvL1qSyEXSjgtCNqCaFgUVectkztuQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, May 25, 2021 at 1:37 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> The obvious concerns are issues around binary upgrades for cases that
> already use the special space? Are you planning to address that by not
> having that path? Or by storing the nonce at the "start" of the special
> space (i.e. [normal data][nonce][existing special])?

Well, there aren't any existing encrypted clusters, so what is the
scenario exactly? Perhaps you are thinking that we'd have a pg_upgrade
option that would take an unencrypted cluster and encrypt all the
pages, without any other page format changes. If so, this design would
preclude that choice, because there might be no free space available.

> Is there an argument for generalizing the nonce approach for to replace
> fake LSNs for unlogged relations?

I hadn't thought about that. Maybe. But that would require including
the nonce always, rather than only when TDE is selected, or including
it always in some kinds of pages and only conditionally in others,
which seems more complex.

> Why is using pd_special better than finding space for a flag bit in the
> header indicating whether it has a nonce? Using pd_special will burden
> all code using special space, and maybe even some that does not (think
> empty pages now occasionally having a non-zero pd_special), whereas
> implementing it on the page level wouldn't quite have the same concerns.

Well, I think there's a lot of code that knows where the line pointer
array starts, and all those calculations will have to become more
complex at runtime if we put the nonce anywhere near the start of the
page. I think there are way fewer things that care about the end of
the page. I dislike the idea that every call to PageGetItem() would
need to know the nonce size - there are hundreds of those calls and
making them more expensive seems a lot worse than the stuff this patch
changes.

It's always possible that I'm confused here, either about what you are
proposing or how impactful it would actually be...

> I do suspect having only the "no nonce" or "nonce is a compile time
> constant" cases would be good performance wise. Stuff like
>
> > +#define MaxHeapTupleSizeLimit (BLCKSZ - MAXALIGN(SizeOfPageHeaderData + \
> > + sizeof(ItemIdData)))
> > +#define MaxHeapTupleSize(tdeNonceSize) (BLCKSZ - MAXALIGN(SizeOfPageHeaderData + \
> > + sizeof(ItemIdData)) - MAXALIGN(tdeNonceSize))
>
> won't be free.

One question here is whether we're comfortable saying that the nonce
is entirely constant. I wasn't sure about that. It seems possible to
me that different encryption algorithms might want nonces of different
sizes, either now or in the future. I am not a cryptographer, but that
seemed like a bit of a limiting assumption. So Bharath and I decided
to make the POC cater to a fully variable-size nonce rather than
zero-or-some-constant. However, if the consensus is that
zero-or-some-constant is better, fair enough! The patch can certainly
be adjusted to cater to work that way.

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2021-05-25 18:31:29 Re: Access root->simple_rte_array instead of Query->rtable for 2 more cases.
Previous Message Tom Lane 2021-05-25 18:20:25 Re: CALL versus procedures with output-only arguments