Re: Unified File API

From: vignesh C <vignesh21(at)gmail(dot)com>
To: John Morris <john(dot)morris(at)crunchydata(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Stephen Frost <stephen(at)crunchydata(dot)com>, David Christensen <david(dot)christensen(at)crunchydata(dot)com>
Subject: Re: Unified File API
Date: 2024-01-06 17:28:30
Message-ID: CALDaNm0ywvupLxUtQARcFqn5E4CqTCAACTzgAzZR3Oj_7WAkyg@mail.gmail.com
Lists: pgsql-hackers

On Thu, 29 Jun 2023 at 13:20, John Morris <john(dot)morris(at)crunchydata(dot)com> wrote:
>
> Background
>
> ==========
>
> PostgreSQL has an amazing variety of routines for accessing files. Consider just the “open file” routines:
> PathNameOpenFile, OpenTemporaryFile, BasicOpenFile, open, fopen, BufFileCreateFileSet,
> BufFileOpenFileSet, AllocateFile, OpenTransientFile, FileSetCreate, FileSetOpen, mdcreate, mdopen,
> Smgr_open.
>
>
>
> On the downside, “amazing variety” also means the code is somewhat confusing and new features are difficult to add.
> Someday, we’d like to add encryption or compression to the various PostgreSQL files.
> To do that, we need to bring all the relevant files into a common file API where we can implement
> the new features.
>
>
>
> Goals of Patch
>
> =============
>
> 1) Unify file access so most of “the other” files can go through a common interface, allowing new features
>    like checksums, encryption or compression to be added transparently.
> 2) Do it in a way which doesn’t change the logic of current code.
> 3) Convert a reasonable set of callers to use the new interface.
>
>
>
> Note the focus is on the “other” files. The buffer cache and the WAL have similar needs,
> but they are being handled in a separate project. (Yes, the two projects are coordinating.)
>
> Patch 0001. Create a common file API.
>
> ===============================
>
> Currently, PostgreSQL files feed into three funnels: 1) system file descriptors (read/write/open),
> 2) C library buffered files (fread/fwrite/fopen), and 3) virtual file descriptors (FileRead/FileWrite/PathNameOpenFile).
> Of these three, virtual file descriptors (VFDs) are the most common. They are also the
> only funnel which is implemented by PostgreSQL.
>
>
>
> Decision: Choose VFDs as the common interface.
>
>
>
> Problem: VFDs are random access only.
>
> Solution: Add sequential read/write code on top of VFDs. (FileReadSeq, FileWriteSeq, FileSeek, FileTell, O_APPEND)
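
To make the sequential layer described above concrete, here is a minimal sketch of how sequential read/write can be built on a positional-only file by tracking an offset. It uses plain POSIX pread/pwrite in place of VFDs, and every name in it (SeqFile, seq_read, seq_write, seq_seek, seq_tell, seq_demo) is hypothetical; this is not the patch's FileReadSeq/FileWriteSeq code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical wrapper (not from the patch): a descriptor that is only
 * ever accessed positionally, plus a tracked sequential offset.
 */
typedef struct SeqFile
{
	int		fd;			/* underlying descriptor, used only via pread/pwrite */
	off_t	offset;		/* current sequential position */
} SeqFile;

/* Sequential write: write at the tracked offset, then advance it. */
static ssize_t
seq_write(SeqFile *f, const void *buf, size_t len)
{
	ssize_t	n = pwrite(f->fd, buf, len, f->offset);

	if (n > 0)
		f->offset += n;
	return n;
}

/* Sequential read: read at the tracked offset, then advance it. */
static ssize_t
seq_read(SeqFile *f, void *buf, size_t len)
{
	ssize_t	n = pread(f->fd, buf, len, f->offset);

	if (n > 0)
		f->offset += n;
	return n;
}

/* Seek and tell only touch the tracked offset; no system call is needed. */
static off_t
seq_seek(SeqFile *f, off_t pos)
{
	assert(pos >= 0);
	f->offset = pos;
	return pos;
}

static off_t
seq_tell(const SeqFile *f)
{
	return f->offset;
}

/* Small self-check on a temporary file; returns 0 on success, -1 otherwise. */
static int
seq_demo(void)
{
	char	path[] = "/tmp/seqfile_demo_XXXXXX";
	char	buf[16] = {0};
	SeqFile	f = {-1, 0};
	int		ok;

	f.fd = mkstemp(path);
	if (f.fd < 0)
		return -1;
	unlink(path);				/* file disappears once the fd is closed */

	ok = seq_write(&f, "hello ", 6) == 6 &&
		seq_write(&f, "world", 5) == 5 &&
		seq_tell(&f) == 11 &&
		seq_seek(&f, 0) == 0 &&
		seq_read(&f, buf, 11) == 11 &&
		seq_tell(&f) == 11 &&
		strcmp(buf, "hello world") == 0;

	close(f.fd);
	return ok ? 0 : -1;
}
```

Because the position lives in the wrapper rather than in the kernel file description, the underlying descriptor can be closed and reopened behind the caller's back without losing the sequential position, which is presumably why this approach fits VFDs.
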
>
>
>
> Problem: VFDs have minimal error handling (based on errno).
>
> Solution: Add an “ferror” style interface (FileError, FileEof, FileErrorCode, FileErrorMsg)
>
>
>
> Problem: Must maintain compatibility with existing error handling code.
>
> Solution: Save and restore errno to minimize changes to existing code.
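
As an illustration of the last two points, here is a rough sketch of an “ferror”-style sticky error state kept next to the descriptor, together with a save/restore of errno around a cleanup call. It is written against plain POSIX descriptors, and all names (ErrFile, err_read, file_error, file_eof, file_error_code, file_error_msg, err_close, err_demo) are hypothetical stand-ins, not the patch's FileError/FileEof/FileErrorCode/FileErrorMsg.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical handle (not from the patch): descriptor plus sticky state. */
typedef struct ErrFile
{
	int		fd;
	int		saved_errno;	/* first error seen on this file, 0 if none */
	bool	eof;			/* true once a read has returned 0 bytes */
} ErrFile;

/* Read wrapper that records errors and EOF on the handle itself. */
static ssize_t
err_read(ErrFile *f, void *buf, size_t len)
{
	ssize_t	n = read(f->fd, buf, len);

	if (n < 0 && f->saved_errno == 0)
		f->saved_errno = errno;
	else if (n == 0 && len > 0)
		f->eof = true;
	return n;
}

/* "ferror"-style accessors: the state survives later changes to errno. */
static bool file_error(const ErrFile *f) { return f->saved_errno != 0; }
static bool file_eof(const ErrFile *f) { return f->eof; }
static int file_error_code(const ErrFile *f) { return f->saved_errno; }
static const char *file_error_msg(const ErrFile *f) { return strerror(f->saved_errno); }

/*
 * Close while preserving the caller's errno, so existing code that checks
 * errno after an earlier failure still sees the original value.
 */
static int
err_close(ErrFile *f)
{
	int		save_errno = errno;
	int		ret = close(f->fd);

	if (ret == 0)
		errno = save_errno;		/* cleanup succeeded: restore caller's errno */
	return ret;
}

/* Small self-check; returns 0 on success, -1 otherwise. */
static int
err_demo(void)
{
	char	buf[8];
	ErrFile	bad = {-1, 0, false};
	ErrFile	ok = {-1, 0, false};

	/* Reading from an invalid descriptor fails with EBADF. */
	if (err_read(&bad, buf, sizeof(buf)) != -1)
		return -1;
	errno = 0;					/* simulate later code clobbering errno */
	if (!file_error(&bad) || file_error_code(&bad) != EBADF)
		return -1;
	assert(file_error_msg(&bad) != NULL);

	/* Reading /dev/null returns 0 bytes immediately, i.e. EOF. */
	ok.fd = open("/dev/null", O_RDONLY);
	if (ok.fd < 0)
		return -1;
	if (err_read(&ok, buf, sizeof(buf)) != 0 || !file_eof(&ok) || file_error(&ok))
		return -1;

	/* errno set before the close must still be visible afterwards. */
	errno = ENOENT;
	if (err_close(&ok) != 0 || errno != ENOENT)
		return -1;
	return 0;
}
```

Note that the save_errno local in err_close is actually read back; a variable that is declared for this purpose but never used is exactly what -Wunused-variable complains about.
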
>
>
>
> Patch 0002. Update code to use the common file API
>
> ===========================================
>
> The second patch alters callers so they use VFDs rather than system or C library files.
> It doesn’t modify all callers, but it does capture many of the files which need
> to be encrypted or compressed. This is definitely WIP.
>
>
>
> Future (not too far away)
>
> =====================
>
> Looking ahead, there will be another set of patches which inject buffering and encryption into
> the VFD interface. The future patches will build on the current work and introduce new “oflags”
> to enable encryption and buffering.
>
>
> Compression is also a possibility, but it is currently a lower priority and a bit tricky for random access files.
> Let us know if you have a use case.

CFbot shows a few compilation warnings/errors at [1]:
[15:54:06.825] ../src/backend/storage/file/fd.c:2420:11: warning: unused variable 'save_errno' [-Wunused-variable]
[15:54:06.825] int ret, save_errno;
[15:54:06.825] ^
[15:54:06.825] ../src/backend/storage/file/fd.c:4026:29: error: use of undeclared identifier 'MAXIMUM_VFD'
[15:54:06.825] Assert(file >= 0 && file < MAXIMUM_VFD);
[15:54:06.825] ^
[15:54:06.825] 1 warning and 1 error generated.

[1] - https://cirrus-ci.com/task/6552527404007424

Regards,
Vignesh
