| From: | Henson Choi <assam258(at)gmail(dot)com> |
|---|---|
| To: | Zsolt Parragi <zsolt(dot)parragi(at)percona(dot)com> |
| Cc: | Konstantin Knizhnik <knizhnik(at)garret(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, sawada(dot)mshk(at)gmail(dot)com, Tatsuo Ishii <ishii(at)postgresql(dot)org> |
| Subject: | Re: RFC: PostgreSQL Storage I/O Transformation Hooks |
| Date: | 2025-12-28 15:25:01 |
| Message-ID: | CAAAe_zAosQq6e6DeReQjZcd7C85BPRCCYOBdyh41FgMrOxuVsg@mail.gmail.com |
| Lists: | pgsql-hackers |
Hi Zsolt,
Thank you for your detailed questions. I'll address each point:
1. Bundling WAL and Buffer Manager
WAL and heap pages are simply different representations of the same
underlying data. Protecting only one side would be cryptographically
incomplete; an attacker could bypass encryption by reading the
unprotected side. Therefore, they must be treated as a single atomic
unit of protection.
2. Scope: Temporary Files, System Tables, and Frontend Tools
I intentionally kept the scope focused. Past TDE proposals often stalled
because they tried to solve everything at once, becoming too large to
review. I prefer a "divide-and-conquer" approach:
- Temporary files: Out of scope for this initial infrastructure proposal.
- System tables: While they cannot be encrypted during bootstrap (since
extensions aren't loaded), they can be transformed page-by-page during
normal operation.
- Frontend tools (pg_waldump, etc.): I am aware of this and have modified
versions of them. Currently, there is no standard mechanism for frontend
hooks, making this a broader challenge. For production, extensions could
ship their own modified frontend tools as a temporary measure. Long-term,
we may need initdb-time settings, fixed for the lifetime of the cluster,
to unify backend/frontend hook behavior.
3. Why Hooks Instead of SMGR
Please see my response to Konstantin in this thread regarding maintenance
debt and the "Separation of Concerns" between storage management and data
transformation.
4. Page Header Flags vs. Fork Files
My primary concern with using fork files for encryption metadata is crash
recovery. If a fork file and the actual data page become inconsistent
(e.g., during a crash), recovery becomes problematic because fork files
are not typically protected by WAL.
Storing the Transform ID in the header flags ensures that the metadata
travels with the page. This is essential for incremental key rotation,
where pages are gradually re-encrypted with newer keys over time. Pages
still encrypted with the oldest key are force-rotated, which allows
continuous key rotation without service interruption. I plan to propose a
separate RFC for this "gradual rotation" mechanism.
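To make the idea concrete, here is a minimal sketch of a Transform ID
packed into spare bits of a 16-bit page flags word, plus the check a
gradual rotation pass might use. The bit layout (top four bits), the
convention that ID 0 means "not transformed", and all demo_* names are
assumptions for illustration only; they are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: top 4 bits of a 16-bit flags word hold the ID. */
#define DEMO_TRANSFORM_SHIFT 12
#define DEMO_TRANSFORM_MASK  0xF000u

/* Store a transform ID (0..15) without disturbing the other flag bits. */
static uint16_t
demo_set_transform_id(uint16_t flags, unsigned id)
{
    assert(id <= 0xF);
    return (uint16_t) ((flags & ~DEMO_TRANSFORM_MASK) |
                       (id << DEMO_TRANSFORM_SHIFT));
}

/* Read the transform ID back from the flags word. */
static unsigned
demo_get_transform_id(uint16_t flags)
{
    return (flags & DEMO_TRANSFORM_MASK) >> DEMO_TRANSFORM_SHIFT;
}

/*
 * Gradual rotation check: a page must be re-encrypted when it is still
 * tagged with the oldest key. ID 0 is assumed to mean "not transformed".
 */
static int
demo_needs_rotation(uint16_t flags, unsigned oldest_id)
{
    unsigned id = demo_get_transform_id(flags);

    return id != 0 && id == oldest_id;
}
```

Because the ID lives in the page itself, a reader can pick the right key
from the page alone, with no side file to keep consistent after a crash.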
5. Benchmarks and Critical Section Overhead
Transformation happens inside the critical section but before acquiring
the WAL lock. On consumer-grade SSDs, the encryption latency is largely
masked by I/O wait times, so the performance impact is negligible. On
faster storage (production SSDs, Apple Silicon machines, etc.), the
reduced I/O wait exposes the encryption overhead, which is visible but
modest. Detailed benchmarks require company approval; I will follow up
later.
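The ordering described above can be sketched roughly as follows: the
costly transformation runs into a private scratch buffer first, and only
the copy into the shared buffer happens while the lock is held. This is
a single-threaded toy under stated assumptions, not the patch's code:
the DemoWal struct, the lock_held flag standing in for the WAL lock, and
the XOR placeholder (which is not encryption) are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_WAL_SIZE 64

typedef struct DemoWal
{
    int     lock_held;          /* toy stand-in for the WAL lock */
    size_t  used;               /* bytes already copied into buf */
    uint8_t buf[DEMO_WAL_SIZE]; /* toy stand-in for shared WAL buffers */
} DemoWal;

/* Placeholder transform: XOR with a one-byte key. NOT real encryption. */
static void
demo_transform(const DemoWal *wal, const uint8_t *in, uint8_t *out,
               size_t len, uint8_t key)
{
    assert(!wal->lock_held);    /* must run before the lock is taken */
    for (size_t i = 0; i < len; i++)
        out[i] = (uint8_t) (in[i] ^ key);
}

/* Returns 0 on success, -1 if the record does not fit. */
static int
demo_insert_record(DemoWal *wal, const uint8_t *rec, size_t len,
                   uint8_t key)
{
    uint8_t scratch[DEMO_WAL_SIZE];

    if (len > DEMO_WAL_SIZE - wal->used)
        return -1;

    /* Expensive step first, while no lock is held. */
    demo_transform(wal, rec, scratch, len, key);

    /* Only the cheap copy happens under the lock. */
    wal->lock_held = 1;
    memcpy(wal->buf + wal->used, scratch, len);
    wal->used += len;
    wal->lock_held = 0;
    return 0;
}
```

Keeping the transformation outside the locked region means other
writers are blocked only for the duration of the copy, which is why the
remaining overhead shows up as per-record latency rather than as lock
contention.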
Best regards,
Henson Choi
On Sun, Dec 28, 2025 at 10:12 PM, Zsolt Parragi <zsolt(dot)parragi(at)percona(dot)com> wrote:
> Hello!
>
> I am glad to see that there are multiple TDE extension proposals being
> worked on. For context, I am one of the developers working on the
> pg_tde[1] extension, as well as on the extensible SMGR proposal that
> Konstantin already linked.
>
> This patch/proposal contains two distinct parts of
> encryption/extensibility, WAL and buffer manager/table data. Based on
> earlier discussions, the opinions of adding extension points to these
> two are quite different, and because of that I'm not sure if bundling
> them together is helpful.
>
> It also appears to be missing some extension points that would be
> required for a more complete encryption solution, such as encrypting
> temporary files or system tables, or handling command-line utilities
> like pg_waldump. Do you have ideas or patches in mind for those areas
> as well?
>
> I have the same question as Konstantin, why did you choose custom
> hooks for the buffer manager instead of the already existing smgr
> interface / extensibility patch? While that patch is not part of the
> core (but I hope it will be), it is already used by multiple companies
> as it supports other use cases, not only encryption. We plan to focus
> more on that thread early next year, we would appreciate any
> feedback/suggestions that could make it better for others.
>
> I also noticed that you added additional flags to the page header.
> Initially we were thinking about something like this, but decided that
> the fork files are better for any encryption (or other storage
> related) extra data. These few bits try to be generic, while also
> restrictive because of the limited amount of data. (and that data is
> specifically per page, if I want something per file or per page range,
> I still need a custom solution)
>
> Regarding the WAL encryption part, we took a completely different
> approach, similar to how we handle normal table data (page-based). I
> will need to think more about this before I can provide meaningful
> feedback on that part of the patch. One initial question, however, is
> whether you have run detailed benchmarks with different workloads.
> That seems to be the trickiest part there, since most of the code runs
> in a critical section. (Not the "unused"/"empty hook" path, but the
> overhead caused by a real encryption plugin using this hook in
> practice)
>
>
> [1]: https://github.com/percona/pg_tde
>