| From: | Michael Paquier <michael(at)paquier(dot)xyz> |
|---|---|
| To: | Dharin Shah <dharinshah95(at)gmail(dot)com> |
| Cc: | Robert Treat <rob(at)xzilla(dot)net>, Peter Eisentraut <peter(at)eisentraut(dot)org>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
| Subject: | Re: Fwd: [PATCH] Add zstd compression for TOAST using extended header format |
| Date: | 2025-12-29 23:45:47 |
| Message-ID: | aVMSqzx7dGSDu4uG@paquier.xyz |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgadmin-hackers pgsql-hackers |
On Mon, Dec 29, 2025 at 02:45:27PM +0100, Dharin Shah wrote:
> The goal is to centralize “how do we interpret an external datum?” so that
> detoast code paths don’t have to reason directly about va_extinfo encoding
> vs payload layout details. This is intended as groundwork for a follow-up
> patch adding a new vartag-based method (e.g., zstd) without scattering
> special cases in detoast paths.
+static bool
+decode_external_toast_pointer(const struct varlena *attr,
+ DecodedExternalToast *decoded)
[...]
+typedef enum ToastDecompressMethod
+{
+ TOAST_DECOMP_NONE = 0,
+ TOAST_DECOMP_PGLZ = 1,
+ TOAST_DECOMP_LZ4 = 2
+} ToastDecompressMethod;
+
+typedef struct DecodedExternalToast
+{
+ Oid toastrelid;
+ Oid valueid;
+ uint32 rawsize; /* Decompressed size; for future methods without tcinfo */
+ uint32 extsize; /* On-disk payload size */
+ ToastDecompressMethod method;
+ uint8 flags;
+} DecodedExternalToast;
Yeah, honestly this is a layer I have been thinking about as well as
one option, but contrary to you I have been focusing on putting that
into varatt.h, with the exception of the value being an Oid8. I think
that you have an interesting point in focusing your implementation to
be stored in the detoast part, though. I'd need to spend a bit more
time to see the result this would lead at with the larger 8-byte issue
in mind, but this is something that would come at no real cost as it
has no function pointer redirection compared to what I was first
envisioning on the other thread. That's especially true if it makes
the CompressionId business easier to mold around when adding a new
vartag.
> Why HAS_TCINFO?
> - Previously, “is compressed?” was used as a proxy for whether the external
> payload begins with tcinfo. This patch makes that explicit: HAS_TCINFO
> captures payload layout, which is distinct from whether the value is
> compressed. This separation is needed for future methods that may store
> external compressed payloads without tcinfo.
It is possible to model the on-memory data as we want. This
suggestion would be OK with some flags.
--
Michael
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Tom Lane | 2025-12-29 23:50:42 | Re: lsyscache: free IndexAmRoutine objects returned by GetIndexAmRoutineByAmId() |
| Previous Message | Matthias van de Meent | 2025-12-29 23:43:12 | Re: lsyscache: free IndexAmRoutine objects returned by GetIndexAmRoutineByAmId() |
| From | Date | Subject | |
|---|---|---|---|
| Previous Message | Dharin Shah | 2025-12-29 13:45:27 | Re: Fwd: [PATCH] Add zstd compression for TOAST using extended header format |