Re: [PATCH] buffile: ensure start offset is aligned with BLCKSZ

From: Antonin Houska <ah(at)cybertec(dot)at>
To: Sasasu <i(at)sasa(dot)su>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, sfrost(at)snowman(dot)net
Subject: Re: [PATCH] buffile: ensure start offset is aligned with BLCKSZ
Date: 2021-11-29 10:05:00
Message-ID: 20253.1638180300@antos
Lists: pgsql-hackers

Sasasu <i(at)sasa(dot)su> wrote:

> Hi hackers,
>
> there has been a very long discussion about TDE, and we agreed that if the
> temporary file I/O can be aligned to some fixed size, it will be easier
> to use some kind of encryption algorithm.
>
> discussion:
> https://www.postgresql.org/message-id/20211025155814.GD20998%40tamriel.snowman.net
>
> This patch adjusts file->curOffset and file->pos before the real IO to
> ensure the start offset is aligned.

Does this patch really pass regression tests? In BufFileRead(), I would
understand if you did

+ file->pos = offsetInBlock;
+ file->curOffset -= offsetInBlock;

rather than

+ file->pos += offsetInBlock;
+ file->curOffset -= offsetInBlock;

Anyway, BufFileDumpBuffer() does not seem to ensure that curOffset ends up at
a block boundary, not to mention BufFileSeek().
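
For illustration only, the invariant being discussed can be sketched in
isolation. This is not code from the patch or from buffile.c: the struct
SketchBufFile and the function sketch_align_to_block() are hypothetical
stand-ins, and only the field names curOffset and pos and the constant BLCKSZ
come from the discussion. The sketch moves curOffset back to the preceding
block boundary and sets pos to the offset within that block, so the logical
position (curOffset + pos) stays the same.

```c
#include <assert.h>
#include <stdint.h>

#define BLCKSZ 8192				/* PostgreSQL's default block size */

/* Hypothetical, simplified stand-in for the two BufFile fields in question. */
typedef struct
{
	int64_t		curOffset;		/* file offset at which the buffer starts */
	int			pos;			/* next read/write position in the buffer */
} SketchBufFile;

/*
 * Move curOffset back to the preceding block boundary and compensate in pos,
 * so that the logical position (curOffset + pos) is unchanged.
 */
static void
sketch_align_to_block(SketchBufFile *file)
{
	int64_t		logical = file->curOffset + file->pos;
	int			offsetInBlock = (int) (logical % BLCKSZ);

	assert(logical >= 0);
	file->curOffset = logical - offsetInBlock;	/* multiple of BLCKSZ */
	file->pos = offsetInBlock;					/* assigned, not added */
}
```

Note that pos is assigned rather than incremented; adding offsetInBlock to a
nonzero pos would move the logical position.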

When I was implementing this for our fork [1], I concluded that the encryption
code path is too specific, so I left the existing code for the unencrypted data
and added separate functions for the encrypted data.

One specific thing is that if you encrypt and write n bytes, but only need
part of them later, you need to read and decrypt exactly those n bytes anyway,
otherwise the decryption won't work. So I decided not only to keep curOffset
at a BLCKSZ boundary, but also to read / write BLCKSZ bytes at a time. This
also makes sense if the scope of the initialization vector (IV) is BLCKSZ
bytes.
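
As a rough sketch of what reading whole blocks implies, the helper below maps
an arbitrary byte range to the block-aligned range that would have to be read
and decrypted to serve it. The names SketchBlockSpan and
sketch_blocks_for_read() are hypothetical and are not taken from the fork; the
only assumption carried over from the text is that data is encrypted in units
of BLCKSZ bytes.

```c
#include <assert.h>
#include <stdint.h>

#define BLCKSZ 8192				/* PostgreSQL's default block size */

/* Hypothetical result type: the aligned range covering a requested read. */
typedef struct
{
	int64_t		start;			/* file offset of the first block to read */
	int64_t		nblocks;		/* number of whole blocks to read and decrypt */
} SketchBlockSpan;

/*
 * Compute which whole blocks cover the byte range [offset, offset + len).
 * Even a one-byte request costs a full BLCKSZ read and decryption.
 */
static SketchBlockSpan
sketch_blocks_for_read(int64_t offset, int64_t len)
{
	SketchBlockSpan span;
	int64_t		end;

	assert(offset >= 0 && len > 0);
	span.start = offset - (offset % BLCKSZ);	/* round down to a boundary */
	end = offset + len;							/* one past the last byte */
	span.nblocks = (end - span.start + BLCKSZ - 1) / BLCKSZ;	/* round up */
	return span;
}
```

For example, a 400-byte read starting at offset 8000 straddles a block
boundary, so under this scheme two full blocks would be read and decrypted.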

Another problem is that you might want to store the IV somewhere in between
the data. In short, the encryption makes the buffered IO rather different and
the specific code should be kept separate, although the same API is used to
invoke it.
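
To show why an IV stored between the data complicates the offset arithmetic,
here is a sketch under one purely hypothetical layout: every BLCKSZ-sized
block on disk starts with an IV of SKETCH_IV_SIZE bytes, followed by the
payload. Neither this layout, nor the 16-byte IV size, nor the function
sketch_logical_to_physical() is taken from the fork or from the patch; they
are assumptions made only for the example.

```c
#include <assert.h>
#include <stdint.h>

#define BLCKSZ 8192							/* PostgreSQL's default block size */
#define SKETCH_IV_SIZE 16					/* hypothetical IV size in bytes */
#define SKETCH_PAYLOAD (BLCKSZ - SKETCH_IV_SIZE)	/* data bytes per block */

/*
 * Translate a logical data offset (as seen by the caller) into the physical
 * file offset, assuming each on-disk block is laid out as [IV | payload].
 */
static int64_t
sketch_logical_to_physical(int64_t logical)
{
	int64_t		blockno;
	int64_t		offsetInPayload;

	assert(logical >= 0);
	blockno = logical / SKETCH_PAYLOAD;
	offsetInPayload = logical % SKETCH_PAYLOAD;
	return blockno * BLCKSZ + SKETCH_IV_SIZE + offsetInPayload;
}
```

With such a layout, logical and physical offsets no longer coincide, so seek
and tell operations would have to translate between the two.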

--
Antonin Houska
Web: https://www.cybertec-postgresql.com

[1] https://github.com/cybertec-postgresql/postgres/tree/PG_14_TDE_1_1
