Re: storing an explicit nonce

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Tom Kincaid <tomjohnkincaid(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
Subject: Re: storing an explicit nonce
Date: 2021-05-25 18:45:40
Message-ID: 20210525184540.GB3048@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, May 25, 2021 at 12:46:45PM -0400, Robert Haas wrote:
> On Thu, Mar 18, 2021 at 2:59 PM Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> > > Ultimately, we need to make sure that LSNs aren't re-used. There's two
> > > sources of LSNs today: those for relations which are being written into
> > > the WAL and those for relations which are not (UNLOGGED relations,
> > > specifically). The 'minimal' WAL level introduces complications with
> >
> > Well, the story is a little more complex than that --- we currently have
> > four LSN uses:
> >
> > 1. real LSNs for WAL-logged relfilenodes
> > 2. real LSNs for GiST indexes for non-WAL-logged relfilenodes of permanenet relations
> > 3. fake LSNs for GiST indexes for relfilenodes of non-permanenet relations
> > 4. zero LSNs for non-GiST non-permanenet relations
> >
> > This patch changes it so #4 gets fake LSNs, and slightly adjusts #2 & #3
> > so the LSNs are always unique.
>
> Hi!
>
> This approach has a few disadvantages. For example, right now, we only
> need to WAL log hints for the first write to each page after a
> checkpoint, but in this approach, if the same page is written multiple
> times per checkpoint cycle, we'd need to log hints every time. In some
> workloads that could be quite expensive, especially if we log an FPI
> every time.

Well, if we create a separate nonce counter, we still need to make sure
it doesn't go backwards during a crash, so we have to WAL log it
somehow, perhaps at a certain interval like 1k and advance the counter
by 1k in case of crash recovery, like we do with the oid counter now, I
think.

The buffer encryption overhead is 2-4%, and WAL encryption is going to
add to that, so I thought hint bit logging overhead would be minimal
in comparison.

> Also, I think that all sorts of non-permanent relations currently get
> zero LSNs, not just GiST. Every unlogged table and every temporary
> table would need to use fake LSNs. Moreover, for unlogged tables, the
> buffer manager would need changes, because it is otherwise going to
> assume that anything it sees in the pd_lsn field other than a zero is
> a real LSN.

Have you looked at the code, specifically EncryptPage():

https://github.com/postgres/postgres/compare/bmomjian:cfe-11-gist..bmomjian:_cfe-12-rel.patch

+ if (!relation_is_permanent && !is_gist_page_or_similar)
+ PageSetLSN(page, LSNForEncryption(relation_is_permanent));

It assigns an LSN to unlogged pages. As far as the buffer manager
seeing fake LSNs that already happens for GiST indexes, so I just built
on that --- seemed to work fine.

> So I would like to propose an alternative: store the nonce in the
> page. Now the next question is where to put it. I think that putting
> it into the page header would be far too invasive, so I propose that
> we instead store it at the end of the page, as part of the special
> space. That makes an awful lot of code not really notice that anything
> is different, because it always thought that the usable space on the
> page ended where the special space begins, and it doesn't really care
> where that is exactly. The code that knows about the special space
> might care a little bit, but whatever private data it's storing is
> going to be at the beginning of the special space, and the nonce would
> be stored - in this proposal - at the end of the special space. So it
> turns out that it doesn't really care that much either.

I think the big problem with that is that it adds a new counter, with
new code, and it makes adding encryption offline, like we do for adding
checksums, pretty much impossible since the page might not have space
for a nonce. It also makes the idea of adding encryption as part of a
pg_upgrade non-link mode also impossible, at least for me. ;-)

I have to ask why we should consider adding it to the special space,
since my current version seems fine, and has minimal code impact, and
has some advantages over using the special space. Is it because of the
WAL hint overhead, or for a cleaner API, or something else?

Also, I need help with all the XXX comments I have in my patches before
I can move forward:

https://wiki.postgresql.org/wiki/Transparent_Data_Encryption#Patches

I stopped working on this to get beta out the door, but next week it
would be nice to continue on this. However, I want to get this patch
into a state where everyone is happy with it, rather than adding more
code with an unclear future.

--
Bruce Momjian <bruce(at)momjian(dot)us> https://momjian.us
EDB https://enterprisedb.com

If only the physical world exists, free will is an illusion.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2021-05-25 18:56:32 Re: storing an explicit nonce
Previous Message Tom Lane 2021-05-25 18:31:29 Re: Access root->simple_rte_array instead of Query->rtable for 2 more cases.