| From: | Henson Choi <assam258(at)gmail(dot)com> |
|---|---|
| To: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, sawada(dot)mshk(at)gmail(dot)com, Tatsuo Ishii <ishii(at)postgresql(dot)org> |
| Subject: | Re: RFC: PostgreSQL Storage I/O Transformation Hooks |
| Date: | 2025-12-28 09:47:21 |
| Message-ID: | CAAAe_zDRQEcQ6RS1G-rtdktX-6MKobrpt-x8KQtY+i4-jL9mXg@mail.gmail.com |
| Lists: | pgsql-hackers |
Hello,
Following up on the RFC, I am submitting the initial patch set for the
proposed infrastructure. These patches introduce a minimal hook-based
protocol to allow extensions to handle data transformation, such as TDE,
while keeping the PostgreSQL core independent of specific cryptographic
implementations.
Implementation Details:
Hook Points in Storage I/O Path
The patch introduces five hook points:
- mdread_post_hook: Called after blocks are read from disk. The extension can
reverse-transform data in place.
- mdwrite_pre_hook and mdextend_pre_hook: Called before writing or extending
blocks. These hooks return a pointer to the transformed buffer.
- xlog_insert_pre_hook and xlog_decode_pre_hook: Handle transformation of WAL
records during insertion and replay.
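
To make the buffer semantics above concrete, here is a minimal standalone sketch. The hook signatures, the `sketch_mdwrite`/`sketch_mdread` call sites, and the XOR "transform" are assumptions for illustration only; the real signatures are in the attached patch and may differ. The point is that the write hook returns a pointer to a separate, extension-owned buffer, while the read hook works in place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLCKSZ 8192             /* PostgreSQL's default block size */

/* Hypothetical signatures; not copied from the patch. */
typedef void (*mdread_post_hook_type) (uint32_t blocknum, uint8_t *buffer);
typedef const uint8_t *(*mdwrite_pre_hook_type) (uint32_t blocknum,
                                                 const uint8_t *buffer);

static mdread_post_hook_type mdread_post_hook = NULL;
static mdwrite_pre_hook_type mdwrite_pre_hook = NULL;

static uint8_t disk[BLCKSZ];    /* stand-in for one on-disk block */
static uint8_t scratch[BLCKSZ]; /* extension-owned output buffer */

/* Toy transform: XOR with a constant. This is NOT encryption. */
static const uint8_t *
toy_write_pre(uint32_t blocknum, const uint8_t *buffer)
{
    (void) blocknum;
    for (size_t i = 0; i < BLCKSZ; i++)
        scratch[i] = buffer[i] ^ 0x5A;
    return scratch;             /* the caller's buffer is left untouched */
}

/* Reverse transform, applied in place after the read. */
static void
toy_read_post(uint32_t blocknum, uint8_t *buffer)
{
    (void) blocknum;
    for (size_t i = 0; i < BLCKSZ; i++)
        buffer[i] ^= 0x5A;
}

/* Simplified write path: write whatever buffer the hook hands back. */
static void
sketch_mdwrite(uint32_t blocknum, const uint8_t *buffer)
{
    const uint8_t *out = buffer;

    if (mdwrite_pre_hook)
        out = mdwrite_pre_hook(blocknum, buffer);
    assert(out != NULL);
    memcpy(disk, out, BLCKSZ);
}

/* Simplified read path: read first, then let the hook undo the transform. */
static void
sketch_mdread(uint32_t blocknum, uint8_t *buffer)
{
    memcpy(buffer, disk, BLCKSZ);
    if (mdread_post_hook)
        mdread_post_hook(blocknum, buffer);
}
```

Returning a separate buffer on write matters because the page in shared buffers must stay readable by other backends while the transformed copy goes to disk.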
Data Integrity and Checksum Protocol
To ensure robust error detection, the hooks follow a specific verification
protocol:
- On Write: The extension transforms the page, sets the Transform ID, then
recalculates the checksum on the transformed data.
- On Read: The extension verifies the on-disk checksum of the transformed
data first. After reverse-transformation, it clears the Transform ID and
recalculates the checksum for the plaintext data.
This ensures corruption is detected regardless of the transformation state.
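
The ordering of these steps can be sketched on a toy page. Everything here is an assumption made for illustration: `ToyPage` is not the real PageHeaderData layout, the additive checksum stands in for pg_checksum_page(), and the XOR transform stands in for a cipher.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAYLOAD_SIZE 64

/* Toy page, NOT the real page layout. */
typedef struct ToyPage
{
    uint16_t checksum;
    uint16_t flags;             /* Transform ID lives in bits 3-7 */
    uint8_t  payload[PAYLOAD_SIZE];
} ToyPage;

#define TRANSFORM_SHIFT 3
#define TRANSFORM_MASK  0x00F8

static unsigned
get_transform_id(const ToyPage *p)
{
    return (p->flags & TRANSFORM_MASK) >> TRANSFORM_SHIFT;
}

static void
set_transform_id(ToyPage *p, unsigned id)
{
    assert(id <= 31);
    p->flags = (uint16_t) ((p->flags & ~TRANSFORM_MASK) | (id << TRANSFORM_SHIFT));
}

/* Toy additive checksum over flags and payload (checksum field excluded). */
static uint16_t
toy_checksum(const ToyPage *p)
{
    uint16_t sum = p->flags;

    for (size_t i = 0; i < PAYLOAD_SIZE; i++)
        sum = (uint16_t) (sum + p->payload[i]);
    return sum;
}

/* Self-inverse toy transform; a stand-in for encrypt/decrypt. */
static void
toy_transform(ToyPage *p)
{
    for (size_t i = 0; i < PAYLOAD_SIZE; i++)
        p->payload[i] ^= 0xA5;
}

/* Write: transform, set the ID, then checksum the transformed image. */
static void
on_write(ToyPage *p, unsigned id)
{
    assert(id >= 1 && id <= 31);
    toy_transform(p);
    set_transform_id(p, id);
    p->checksum = toy_checksum(p);
}

/* Read: verify first; only then reverse, clear the ID, and re-checksum. */
static bool
on_read(ToyPage *p)
{
    if (toy_checksum(p) != p->checksum)
        return false;           /* corruption detected before any decryption */
    if (get_transform_id(p) == 0)
        return true;            /* plain page, nothing to reverse */
    toy_transform(p);
    set_transform_id(p, 0);
    p->checksum = toy_checksum(p);
    return true;
}
```

Verifying before the reverse transform means a damaged block is reported as a checksum failure rather than being decrypted into garbage.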
WAL Safety via XLR_BLOCK_ID_TRANSFORMED (251)
For WAL records, I have introduced a specific block ID (251) to mark
transformed data. If the decryption extension is not loaded, the WAL reader
will encounter this unknown block ID and fail fast, preventing the system
from incorrectly interpreting encrypted data as valid WAL records.
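
A rough sketch of the fail-fast behavior follows. The four existing XLR_BLOCK_ID_* values and XLR_MAX_BLOCK_ID are as defined in access/xlogrecord.h; the `DecodeResult` enum, both functions, and the flat array of block IDs are simplifications invented for this illustration (a real record interleaves IDs with lengths and data).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Existing values from access/xlogrecord.h */
#define XLR_MAX_BLOCK_ID            32
#define XLR_BLOCK_ID_DATA_SHORT     255
#define XLR_BLOCK_ID_DATA_LONG      254
#define XLR_BLOCK_ID_ORIGIN         253
#define XLR_BLOCK_ID_TOPLEVEL_XID   252
/* Proposed by this patch set */
#define XLR_BLOCK_ID_TRANSFORMED    251

/* Hypothetical result type for this sketch. */
typedef enum DecodeResult
{
    DECODE_OK,                  /* ID known to the core decoder */
    DECODE_TRANSFORMED,         /* hand the record to the extension first */
    DECODE_INVALID              /* unknown ID: report an invalid record */
} DecodeResult;

static DecodeResult
classify_block_id(uint8_t block_id, bool hook_installed)
{
    if (block_id <= XLR_MAX_BLOCK_ID)
        return DECODE_OK;       /* ordinary block reference */

    switch (block_id)
    {
        case XLR_BLOCK_ID_DATA_SHORT:
        case XLR_BLOCK_ID_DATA_LONG:
        case XLR_BLOCK_ID_ORIGIN:
        case XLR_BLOCK_ID_TOPLEVEL_XID:
            return DECODE_OK;
        case XLR_BLOCK_ID_TRANSFORMED:
            /* Without the extension, 251 is just another unknown ID. */
            return hook_installed ? DECODE_TRANSFORMED : DECODE_INVALID;
        default:
            return DECODE_INVALID;
    }
}

/* Returns true only if every block ID in the record is acceptable. */
static bool
record_is_decodable(const uint8_t *ids, size_t n, bool hook_installed)
{
    assert(ids != NULL);
    for (size_t i = 0; i < n; i++)
    {
        if (classify_block_id(ids[i], hook_installed) == DECODE_INVALID)
            return false;       /* stop at the first unknown ID */
    }
    return true;
}
```

Because 251 sits just below the existing special IDs and above XLR_MAX_BLOCK_ID, a server without the extension should reject it the same way it rejects any other unrecognized block ID.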
PageHeader Transform ID (5-bit)
I have allocated bits 3-7 of pd_flags in the PageHeader for a Transform ID.
This allows the engine and extensions to identify the transformation state
of a page (e.g., key versioning or algorithm type) without attempting
decryption. It ensures backward compatibility: pages with Transform ID 0
are treated as standard untransformed pages.
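
For illustration, the bit layout could be accessed as below. The three PD_* flag values are the existing ones from storage/bufpage.h (they occupy bits 0-2); the PD_TRANSFORM_* names and the two helper functions are hypothetical and may not match what the patch defines.

```c
#include <assert.h>
#include <stdint.h>

/* Existing pd_flags bits (storage/bufpage.h), occupying bits 0-2 */
#define PD_HAS_FREE_LINES   0x0001
#define PD_PAGE_FULL        0x0002
#define PD_ALL_VISIBLE      0x0004

/* Hypothetical names for the proposed 5-bit field in bits 3-7 */
#define PD_TRANSFORM_SHIFT  3
#define PD_TRANSFORM_MASK   0x00F8
#define PD_TRANSFORM_MAX    31

/* Extract the Transform ID; 0 means a standard untransformed page. */
static inline uint16_t
pd_get_transform_id(uint16_t pd_flags)
{
    return (uint16_t) ((pd_flags & PD_TRANSFORM_MASK) >> PD_TRANSFORM_SHIFT);
}

/* Return pd_flags with the Transform ID replaced and other bits preserved. */
static inline uint16_t
pd_set_transform_id(uint16_t pd_flags, uint16_t id)
{
    assert(id <= PD_TRANSFORM_MAX);
    return (uint16_t) ((pd_flags & ~PD_TRANSFORM_MASK) |
                       (id << PD_TRANSFORM_SHIFT));
}
```

Five bits give IDs 0 through 31, so with 0 reserved for untransformed pages an extension has 31 values available for key versions or algorithm types.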
Memory and Critical Section Safety
As demonstrated in the contrib/test_tde reference implementation, cipher
contexts are pre-allocated in _PG_init to avoid memory allocation during
critical sections. For WAL transformation,
MemoryContextAllowInCriticalSection() is used to allow buffer reallocation
within critical sections; if OOM occurs during buffer growth, it results in
a controlled PANIC.
Performance Considerations
When hooks are not set (default), the overhead is limited to a single NULL
pointer comparison per I/O operation. This is architecturally consistent
with existing PostgreSQL hooks and is designed to have a negligible impact
on performance.
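
The pattern referred to here is the usual PostgreSQL hook convention: a global function pointer tested for NULL at the call site, with each extension saving and calling the previously installed hook. The sketch below is standalone; the hook signature and the two toy extensions are assumptions for illustration, not code from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical signature, simplified for this sketch. */
typedef void (*mdread_post_hook_type) (uint8_t *buffer, size_t len);

static mdread_post_hook_type mdread_post_hook = NULL;

/* Call site: with no extension loaded, the only cost is this NULL test. */
static void
call_read_hook(uint8_t *buffer, size_t len)
{
    if (mdread_post_hook != NULL)
        mdread_post_hook(buffer, len);
}

/* Toy extension A: adds 1 to every byte. */
static mdread_post_hook_type prev_hook_a = NULL;

static void
ext_a_read_post(uint8_t *buffer, size_t len)
{
    if (prev_hook_a)
        prev_hook_a(buffer, len);       /* run earlier hooks first */
    for (size_t i = 0; i < len; i++)
        buffer[i] = (uint8_t) (buffer[i] + 1);
}

/* What extension A would do in its _PG_init(). */
static void
ext_a_init(void)
{
    assert(mdread_post_hook != ext_a_read_post);    /* no double install */
    prev_hook_a = mdread_post_hook;
    mdread_post_hook = ext_a_read_post;
}

/* Toy extension B: doubles every byte. */
static mdread_post_hook_type prev_hook_b = NULL;

static void
ext_b_read_post(uint8_t *buffer, size_t len)
{
    if (prev_hook_b)
        prev_hook_b(buffer, len);
    for (size_t i = 0; i < len; i++)
        buffer[i] = (uint8_t) (buffer[i] * 2);
}

static void
ext_b_init(void)
{
    assert(mdread_post_hook != ext_b_read_post);
    prev_hook_b = mdread_post_hook;
    mdread_post_hook = ext_b_read_post;
}
```

With A loaded before B, a read runs A's logic and then B's. For transformations that must be reversed in the opposite order on write, the load order of the extensions becomes part of the contract.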
Attached Patches:
v20251228-0001-Add-Storage-I-O-Transform-Hooks-for-PostgreSQL.patch: Core
infrastructure.
v20251228-0002-Add-test_tde-extension-for-TDE-testing.patch: Reference
implementation using AES-256-CTR.
I look forward to your comments and feedback.
Regards,
Henson Choi
On Sun, Dec 28, 2025 at 4:49 PM, Henson Choi <assam258(at)gmail(dot)com> wrote:
> RFC: PostgreSQL Storage I/O Transformation Hooks Infrastructure for a
> Technical Protocol Between RDBMS Core and Data Security Experts
>
> *Author:* Henson Choi assam258(at)gmail(dot)com
>
> *Date:* 2025-12-28
>
> *PostgreSQL Version:* master (Development)
> ------------------------------
> 1. Summary & Motivation
>
> This RFC proposes the introduction of minimal hooks into the PostgreSQL
> storage layer and the addition of a *Transformation ID* field to the
> PageHeader.
>
> A Diplomatic Protocol Between Expert Groups
>
> The core motivation of this proposal is *“Separation of Concerns and
> Mutual Respect.”*
>
> Historically, discussions around Transparent Data Encryption (TDE) have
> often felt like putting security experts on trial in a foreign
> court—specifically, the “Court of RDBMS.” It is time to treat them not as
> defendants to be judged by database-specific rules, but as an *equal
> neighboring community* with their own specialized sovereignty.
>
> *The issue has never been a failure of technology, but rather a
> misplacement of the focal point.* While previous discussions were mired
> in the technicalities of “how to hardcode encryption into the core,” this
> proposal shifts the debate toward an architectural solution: “what
> interface the core should provide to external experts.”
>
> - *RDBMS Experts* provide a trusted pipeline responsible for data I/O
> paths and consistency.
> - *Security Experts* take responsibility for the specialized domain of
> encryption algorithms and key management.
>
> This hook system functions as a *Technical Protocol*—a high-level
> agreement that allows these two expert groups to exchange data securely
> without encroaching on each other’s territory.
> ------------------------------
> 2. Design Principles
>
> 1. *Delegation of Authority:* The core remains independent of specific
> encryption standards, providing a “free territory” where security experts
> can respond to an ever-changing security landscape.
> 2. *Diplomatic Convention:* The Transformation ID acts as a
> communication protocol between the engine and the extension. The engine
> uses this ID to identify the state of the data and hands over control to
> the appropriate expert (the extension).
> 3. *Minimal Interference:* Overhead is kept near zero when hooks are
> not in use, ensuring the native performance of the PostgreSQL engine.
>
> ------------------------------
> 3. Proposal Specifications
>
> 3.1 The Interface (Hook Points)
>
> We allow intervention by security experts through five contact points
> along the I/O path:
>
> - *Read/Write Hooks:* mdread_post, mdwrite_pre, mdextend_pre
> (Transformation of the data area)
> - *WAL Hooks:* xlog_insert_pre, xlog_decode_pre (Transformation of
> transaction logs)
>
> 3.2 The Protocol Identifier (PageHeader Transformation ID)
>
> We allocate 5 bits of pd_flags to define the “Security State” of a page.
> This serves as a *Status Message* sent by the security expert to the
> engine, utilized for key versioning and as a migration marker.
> ------------------------------
> 4. Reference Implementation: contrib/test_tde
>
> A Standard Code of Conduct for Security Experts
>
> This reference implementation exists not as a commercial product, but to
> define the *Standards of the Diplomatic Protocol* that
> encryption/decryption experts must follow when entering the PostgreSQL
> domain.
>
> 1. *Deterministic IV Derivation:* Demonstrates how to achieve
> cryptographic safety by trusting unique values provided by the engine
> (e.g., LSN).
> 2. *Critical Section Safety:* Defines memory management regulations
> that security logic must follow within “Critical Sections” to maintain
> system stability.
> 3. *Hook Chaining:* Demonstrates a cooperative structure that allows
> peaceful coexistence with other expert tools (e.g., compression, auditing).
>
> ------------------------------
> 5. Scope
>
> - *In-Scope:* Backend hook infrastructure, Transformation ID field,
> and reference code demonstrating diplomatic protocol compliance.
> - *Out-of-Scope:* Specific Key Management Systems (KMS), selection of
> specific cryptographic algorithms, and integration with external tools.
>
> This proposal represents a strategic diplomatic choice: rather than the
> PostgreSQL core assuming all security responsibilities, it grants security
> experts a *sovereign territory through extensions* where they can perform
> at their best.
>
| Attachment | Content-Type | Size |
|---|---|---|
| v20251228-0001-Add-Storage-I-O-Transform-Hooks-for-PostgreSQL.patch | application/x-patch | 14.8 KB |
| v20251228-0002-Add-test_tde-extension-for-TDE-testing.patch | application/x-patch | 49.6 KB |