Re: Zstandard support for toast compression

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Nikolay Shaplov <dhyan(at)nataraj(dot)su>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Zstandard support for toast compression
Date: 2022-05-20 20:17:42
Lists: pgsql-hackers


* Nikolay Shaplov (dhyan(at)nataraj(dot)su) wrote:
> In a message of Tuesday, 17 May 2022, 23:01:07 MSK, Tom Lane wrote:
> Hi! I came to this branch looking for a patch to review, but I guess I would
> join the discussion instead of reading the code.

Seems that's what would be helpful now; thanks for joining the
discussion.

> > >> Yeah - I think we had better reserve the fourth bit pattern for
> > >> something extensible e.g. another byte or several to specify the
> > >> actual method, so that we don't have a hard limit of 4 methods. But
> > >> even with such a system, the first 3 methods will always and forever
> > >> be privileged over all others, so we'd better not make the mistake of
> > >> adding something silly as our third algorithm.
> > >
> > > In such a situation, would they really end up being properly distinct
> > > when it comes to what our users see..? I wouldn't really think so.
> >
> > It should be transparent to users, sure, but the point is that the
> > first three methods will have a storage space advantage over others.
> > Plus we'd have to do some actual work to create that extension mechanism.
> Postgres is well known for extensibility. One can write one's own
> implementation of almost everything and make it an extension.
> One would hardly need more than one (or two) additional compression
> methods, but which method one will really need is hard to say.

A thought I've had before is that it'd be nice to specify a particular
compression method on a per-data-type basis. That wasn't the direction
this work took, for reasons, but I wonder about perhaps still having a
data type compression method, and perhaps one of these bits might be "the
data type's (default?) compression method". Even so, having an
extensible way to add new compression methods would be a good thing.
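
To make the bit-pattern discussion a bit more concrete, here's a rough
sketch of the kind of encoding being talked about: two bits select the
method, and the fourth bit pattern means "the real method id is in the
next byte". This is not the actual TOAST header code; the layout is
simplified and every name below (encode_header, decode_header,
METHOD_EXTENDED, and so on) is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout: the top 2 bits of a 32-bit header word carry the
 * compression method, the low 30 bits carry the uncompressed size.
 */
#define METHOD_SHIFT    30
#define SIZE_MASK       ((UINT32_C(1) << METHOD_SHIFT) - 1)
#define METHOD_EXTENDED 3u   /* fourth bit pattern: id follows in one extra byte */

/* Write a header into buf (at least 5 bytes); return bytes used (4 or 5). */
static size_t
encode_header(uint8_t *buf, uint32_t rawsize, uint8_t method)
{
    /* Method ids 0..2 fit inline; anything else uses the escape pattern. */
    uint32_t bits = (method < METHOD_EXTENDED) ? method : METHOD_EXTENDED;
    uint32_t word;

    assert(rawsize <= SIZE_MASK);
    word = (bits << METHOD_SHIFT) | rawsize;

    /* Fixed byte order, chosen only to keep the sketch portable. */
    buf[0] = (uint8_t) (word >> 24);
    buf[1] = (uint8_t) (word >> 16);
    buf[2] = (uint8_t) (word >> 8);
    buf[3] = (uint8_t) word;

    if (bits == METHOD_EXTENDED)
    {
        buf[4] = method;        /* the extra byte the first three never pay for */
        return 5;
    }
    return 4;
}

/* Read a header back; return the number of header bytes consumed. */
static size_t
decode_header(const uint8_t *buf, uint32_t *rawsize, uint8_t *method)
{
    uint32_t word = ((uint32_t) buf[0] << 24) | ((uint32_t) buf[1] << 16) |
                    ((uint32_t) buf[2] << 8) | (uint32_t) buf[3];
    uint32_t bits = word >> METHOD_SHIFT;

    *rawsize = word & SIZE_MASK;
    if (bits == METHOD_EXTENDED)
    {
        *method = buf[4];
        return 5;
    }
    *method = (uint8_t) bits;
    return 4;
}
```

With something like this, methods 0 through 2 cost four header bytes and
everything else costs five, which is the storage advantage Tom mentions
for the first three.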

For compression methods that we already support in other parts of the
system, it seems clear that we should allow those to be used for column
compression too. We should certainly also support a way to specify, on
a per-compression-method basis, what the compression level should be.
> So I guess it would be much better to create an API for creating and
> registering one's own compression method and create a built-in Zstd
> compression method that can be used (or optionally not used) via that API.

While I generally agree that we want to provide extensibility in this
area, given that we already have zstd as a compile-time option and it
exists in other parts of the system, I don't think it makes sense to
require users to install an extension to use it.
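
Roughly what I have in mind, as a sketch only: a method table where the
methods we already ship are simply there, plus a registration call for
anything an extension wants to add. None of this is existing PostgreSQL
API; register_method, lookup_method and the table itself are
hypothetical names, and the sketch ignores that lz4 and zstd are
compile-time options.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_METHODS 16

typedef struct
{
    const char *name;
    int         builtin;    /* 1 = shipped with the server, 0 = added later */
} method_entry;

/* Built-in methods need no registration step at all. */
static method_entry methods[MAX_METHODS] = {
    {"pglz", 1},
    {"lz4", 1},
    {"zstd", 1},
};
static size_t n_methods = 3;

/* Return the method's slot number, or -1 if it is unknown. */
static int
lookup_method(const char *name)
{
    size_t i;

    for (i = 0; i < n_methods; i++)
    {
        if (strcmp(methods[i].name, name) == 0)
            return (int) i;
    }
    return -1;
}

/* Add an extension-provided method; return its slot, or -1 on failure. */
static int
register_method(const char *name)
{
    assert(name != NULL);

    /* Refuse duplicates (including built-in names) and a full table. */
    if (lookup_method(name) >= 0 || n_methods == MAX_METHODS)
        return -1;

    methods[n_methods].name = name;
    methods[n_methods].builtin = 0;
    return (int) n_methods++;
}
```

The point being that looking up "zstd" works out of the box, while the
registration path stays available for whatever else people come up with.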

> Moreover, I guess this API (maybe with some modification) can be used for
> seamless data encryption, for example.

Perhaps.. but this kind of encryption wouldn't allow indexing and
certainly lots of other metadata would still be unencrypted (the entire
system catalog being a good example..).


