| From: | Dharin Shah <dharinshah95(at)gmail(dot)com> |
|---|---|
| To: | Michael Paquier <michael(at)paquier(dot)xyz> |
| Cc: | Peter Eisentraut <peter(at)eisentraut(dot)org>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
| Subject: | Re: Fwd: [PATCH] Add zstd compression for TOAST using extended header format |
| Date: | 2025-12-24 00:47:16 |
| Message-ID: | CAOj6k6f2B3hNxDcnB5AgHX4kaTW8XTAfMAjRx4upDBOugxqF4w@mail.gmail.com |
| Lists: | pgadmin-hackers pgsql-hackers |
Hello,
Following up on my earlier patch submission, I've reworked the zstd TOAST
compression implementation based on our discussion here. The new patch now
avoids the 20-byte extended header.
Current Approach
- New `VARTAG_ONDISK_ZSTD` (value 19) for ZSTD external storage
- Maintains existing 16-byte varatt_external structure
- ZSTD external-only (no inline compression)
Note: Using a dedicated VARTAG_ONDISK_ZSTD keeps the on-disk TOAST pointer
payload at 16 bytes, but it is not a general extensible metadata carrier.
If PostgreSQL later adopts a more general extensible TOAST framework, this
change should not block it; VARTAG_ONDISK_ZSTD would remain as a supported
legacy encoding, while new toasted values could be written using the newer
framework and old values rewritten via normal table rewrites.
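To make the layout concrete, here is a rough Python model of the idea, not code from the patch. The tag values 1, 2, 3 and 18, the four 4-byte fields, and the 30-bit size field mirror what PostgreSQL's varatt.h defines today. The value 19 is the one proposed above. The function names, the native byte order, and the choice to leave the two method bits at zero under the new tag are assumptions made for illustration only.

```python
import struct
from enum import IntEnum

VARHDRSZ = 4                      # varlena length header size
EXTSIZE_BITS = 30                 # width of the external-size field in va_extinfo
EXTSIZE_MASK = (1 << EXTSIZE_BITS) - 1


class VarTag(IntEnum):
    INDIRECT = 1
    EXPANDED_RO = 2
    EXPANDED_RW = 3
    ONDISK = 18
    ONDISK_ZSTD = 19              # new tag proposed in this patch


# va_rawsize (int32), va_extinfo (uint32), va_valueid (Oid), va_toastrelid (Oid)
POINTER_FORMAT = "=iIII"
LEGACY_METHODS = {0: "pglz", 1: "lz4"}   # ids kept in the top 2 bits of va_extinfo


def pack_pointer(rawsize, extsize, method_bits, valueid, toastrelid):
    """Build the 16-byte pointer payload (hypothetical helper)."""
    extinfo = (method_bits << EXTSIZE_BITS) | (extsize & EXTSIZE_MASK)
    return struct.pack(POINTER_FORMAT, rawsize, extinfo, valueid, toastrelid)


def describe(tag, payload):
    """Return how the external value is stored, using the tag as discriminator."""
    rawsize, extinfo, _valueid, _toastrelid = struct.unpack(POINTER_FORMAT, payload)
    extsize = extinfo & EXTSIZE_MASK
    if tag == VarTag.ONDISK_ZSTD:
        # The tag alone identifies zstd, so no bits of va_extinfo are consumed.
        return "zstd"
    if tag != VarTag.ONDISK:
        raise ValueError("not an on-disk TOAST pointer")
    if extsize >= rawsize - VARHDRSZ:
        return "uncompressed"
    return LEGACY_METHODS.get(extinfo >> EXTSIZE_BITS, "unknown")


legacy = pack_pointer(100000, 20000, 1, 16400, 16390)
zstd = pack_pointer(100000, 12000, 0, 16401, 16390)
print(len(legacy), describe(VarTag.ONDISK, legacy))
print(len(zstd), describe(VarTag.ONDISK_ZSTD, zstd))
```

Both payloads are the same 16 bytes; only the tag byte in front of them differs, which is why the heap tuple does not grow.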
Storage (170 MB uncompressed):
ZSTD: 22 MB (7.60x) - 38.7% space savings vs LZ4
PGLZ: 36 MB (4.76x)
LZ4: 36 MB (4.66x)
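For anyone checking the arithmetic, the savings figure follows from the ratios rather than from the rounded MB values. A small sketch (the helper name is mine, not from the benchmark script):

```python
def savings_vs(baseline_ratio, candidate_ratio):
    """Fraction of space saved by the candidate relative to the baseline.

    Both arguments are compression ratios (uncompressed size / compressed size).
    """
    return 1.0 - baseline_ratio / candidate_ratio


UNCOMPRESSED_MB = 170
RATIOS = {"zstd": 7.60, "pglz": 4.76, "lz4": 4.66}

# Compressed size implied by each ratio, rounded to whole MB.
for method, ratio in RATIOS.items():
    print(method, round(UNCOMPRESSED_MB / ratio), "MB")

print(f"zstd vs lz4:  {savings_vs(RATIOS['lz4'], RATIOS['zstd']):.1%}")
print(f"zstd vs pglz: {savings_vs(RATIOS['pglz'], RATIOS['zstd']):.1%}")
```

With the ratios above this gives 38.7% against LZ4 and 37.4% against PGLZ, and the implied sizes round to the 22 MB and 36 MB shown.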
Key findings:
- Large values (>50KB): ZSTD 33% better compression than PGLZ (~30% better
than LZ4)
- Low-entropy data: ZSTD compresses what LZ77 methods cannot
- Small values: ZSTD pays external overhead vs inline PGLZ/LZ4
While ZSTD uses slightly less space overall, the external storage mechanism
incurs a TOAST fetch overhead for small values, potentially impacting
performance.
Backwards Compatibility Tests
- Mixed compression: Rows with PGLZ, LZ4, and ZSTD coexist and decompress
correctly
- Lazy recompression: ALTER COLUMN ... SET COMPRESSION zstd affects new
data; existing data is lazily recompressed upon UPDATE or VACUUM FULL.
- Inline vs external: Small values remain inline; large values use
appropriate external compression.
Data integrity: All data decompresses correctly across all methods.
Trade-offs and Design Considerations
- External-only avoids consuming cmid=3 and extended header complexity
- Slice access: no ZSTD-specific optimization (follow-up area)
- Hybrid inline/external for small values: not in this patch (feedback
welcome)
Reviewer Questions
- Is vartag-based external-only acceptable?
- Should compression level (currently 3) be configurable?
- Is the external storage overhead for small values acceptable, or is hybrid
inline/external behavior needed?
Thanks,
Dharin
On Thu, Dec 18, 2025 at 11:44 PM Michael Paquier <michael(at)paquier(dot)xyz>
wrote:
> On Thu, Dec 18, 2025 at 10:44:22PM +0100, Dharin Shah wrote:
> > I want to make sure I understand your main point: you're OK with a new
> > `vartag_external`, but prefer we avoid increasing the heap TOAST pointer
> > from 16 -> 20 bytes since every zstd-toasted value would pay +4 bytes in
> > the main heap tuple.
>
> That would be my choice, yes. Not sure about the opinion of others on
> this matter.
>
> > I also realize the "compatibility" of the extended header doesn't buy us
> > much — we'll need to support the existing 16-byte varatt_external forever
> > for backward compatibility. Adding a 20-byte structure just means two
> > formats to maintain indefinitely.
>
> Yes. Patches have to maintain on-disk compatibility.
>
> > A couple clarifying questions if we go with new vartag (e.g.,
> > `VARTAG_ONDISK_ZSTD`), same 16-byte `varatt_external` payload, vartag as
> > discriminator
> > 1. How should we handle future methods beyond zstd? One tag per method, or
> > store a method id elsewhere (e.g., in TOAST chunk header)?
>
> My suspicion would be that we could either use a new set of vartags in
> the future for each compression method. When it comes to zstd there
> is something that comes in play: we could set some bits related to
> dictionaries at tuple level. Not sure if this is the best design or
> if using an attribute-level option is more adapted (for example a
> JSONB blob could be applied as an attribute with common keys in a
> dictionary saving a lot of on-disk space even before compression),
> but keeping some bits free in the 16-byte header leaves this option
> open with a new vartag_external. Saying that, zstd is good enough
> that I strongly suspect that we would not regret it for quite a few
> years. One issue that has pushed towards the addition of lz4 as an
> option for toast compression is that pglz was worse in terms of CPU
> cost. zlib is also more expensive than lz4 or zstd, especially at
> very high compression level for usually little compression gains.
>
> > 2. And re: "as long as the TOAST value is 32 bits" — are you referring to
> > the 30-bit extsize field in va_extinfo (i.e., avoid stealing bits from
> > extsize for method encoding)?
>
> I mean extending the TOAST value to 8 bytes, as per the following
> issues:
> https://www.postgresql.org/message-id/764273.1669674269%40sss.pgh.pa.us
> https://commitfest.postgresql.org/patch/5830/
>
> > *Key findings (I guess well known at this point):*
> > - ZSTD excels for repetitive/pattern-heavy data (6.7x better than PGLZ)
> > - For low-redundancy data (MD5 hashes), ZSTD still achieves ~2x better
> > - The T4 result showing zstd as "worse" is not about compression quality -
> > it's about missing inline storage support. ZSTD actually compresses better,
> > but pays unnecessary TOAST overhead.
> >
> > I'll share the detailed benchmark script with the next patch revision. But
> > also a potential path forward could be that we could just fully replace
> > pglz (can bring it up later in different thread)
>
> I don't think that we will ever be able to remove pglz. It would be
> nice, as final result of course, but I also expect that not being able
> to decompress pglz data is going to lead to a lot of user pain. That
> would be also very expensive to check at upgrade for large instances.
>
> > *On Testing and Patch Structure*
> > Agreed on both points:
> > - I'll use `compression_zstd.sql` following the `compression_lz4.sql`
> > pattern (removing the test_toast_ext module)
>
> Okay.
>
> > - I'll split the GUC refactoring into a separate preparatory patch
>
> This refactoring, if done nicely, is worth an independent piece. It's
> something that I have actually done for the sake of the other thread,
> though the result was not really much liked by others. Perhaps I'm
> just lacking imagination with this abstraction, and I'd surely welcome
> different ideas.
> --
> Michael
>
| Attachment | Content-Type | Size |
|---|---|---|
| benchmark_toast_compression.sql | application/octet-stream | 26.2 KB |
| v3-0001-Add-ZSTD-TOAST-compression-using-VARTAG-ONDISK-ZSTD.patch | application/octet-stream | 52.1 KB |
| backwards_compatibility_test.sql | application/octet-stream | 13.8 KB |