Re: [HACKERS] Custom compression methods

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Ildus Kurbangaliev <i(dot)kurbangaliev(at)postgrespro(dot)ru>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Custom compression methods
Date: 2017-12-12 21:33:48
Message-ID: CA+Tgmoax3Hz3ZLupxXofurFbpgKBEwe8qd7N65C7G6xoE=xQTw@mail.gmail.com
Lists: pgsql-hackers

On Mon, Dec 11, 2017 at 2:53 PM, Tomas Vondra
<tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> But let me play the devil's advocate for a while and question the
> usefulness of this approach to compression. Some of the questions were
> mentioned in the thread before, but I don't think they got the attention
> they deserve.

Sure, thanks for chiming in. I think it is good to make sure we are
discussing this stuff.

> But perhaps we should simply make it an initdb option (in which case the
> whole cluster would simply use e.g. lz4 instead of pglz)?
>
> That seems like a much simpler approach - it would only require some
> ./configure options to add --with-lz4 (and other compression libraries),
> an initdb option to pick compression algorithm, and probably noting the
> choice in cluster controldata.
>
> No dependencies tracking, no ALTER TABLE issues, etc.
>
> Of course, it would not allow using different compression algorithms for
> different columns (although it might perhaps allow different compression
> level, to some extent).
>
> Conclusion: If we want to offer a simple cluster-wide pglz alternative,
> perhaps this patch is not the right way to do that.

I actually disagree with your conclusion here. I mean, if you do it
that way, then it has the same problem as checksums: changing
compression algorithms requires a full dump-and-reload of the
database, which makes it more or less a non-starter for large
databases. On the other hand, with the infrastructure provided by
this patch, we can have a default_compression_method GUC that will be
set to 'pglz' initially. If the user changes it to 'lz4', or we ship
a new release where the new default is 'lz4', then new tables created
will use that new setting, but the existing stuff keeps working. If
you want to upgrade your existing tables to use lz4 rather than pglz,
you can change the compression option for those columns: COMPRESS
lz4 PRESERVE pglz to do it incrementally, or just COMPRESS lz4 to
force a rewrite of an individual table. That's really
powerful, and I think users will like it a lot.
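
To sketch what I have in mind (treat the GUC name and the exact
grammar as illustrative, not final):

    -- hypothetical default, picked up by newly created columns
    SET default_compression_method = 'lz4';

    -- incremental switch: new values are written with lz4, while
    -- existing pglz values remain readable
    ALTER TABLE t ALTER COLUMN payload COMPRESS lz4 PRESERVE pglz;

    -- or force a rewrite so the whole column ends up as lz4
    ALTER TABLE t ALTER COLUMN payload COMPRESS lz4;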

In short, your approach, while perhaps a little simpler to code, seems
fraught with operational problems that this design avoids.

> Custom datatype-aware compression (e.g. the tsvector)
> ----------------------------------------------------------------------
>
> Exploiting knowledge of the internal data type structure is a promising
> way to improve compression ratio and/or performance.
>
> The obvious question of course is why shouldn't this be done by the data
> type code directly, which would also allow additional benefits like
> operating directly on the compressed values.
>
> Another thing is that if the datatype representation changes in some
> way, the compression method has to change too. So it's tightly coupled
> to the datatype anyway.
>
> This does not really require any new infrastructure, all the pieces are
> already there.
>
> In some cases that may not be quite possible - the datatype may not be
> flexible enough to support alternative (compressed) representation, e.g.
> because there are no bits available for "compressed" flag, etc.
>
> Conclusion: IMHO if we want to exploit the knowledge of the data type
> internal structure, perhaps doing that in the datatype code directly
> would be a better choice.

I definitely think there's a place for compression built right into
the data type. I'm still happy about commit
145343534c153d1e6c3cff1fa1855787684d9a38 -- although really, more
needs to be done there. But that type of improvement and what is
proposed here are basically orthogonal. Having either one is good;
having both is better.

I think there may also be a place for declaring that a particular data
type has a "privileged" type of TOAST compression; if you use that
kind of compression for that data type, the data type will do smart
things, and if not, it will have to decompress in more cases. But I
think this infrastructure makes that kind of thing easier, not harder.

> Custom datatype-aware compression with additional column-specific
> metadata (e.g. the jsonb with external dictionary).
> ----------------------------------------------------------------------
>
> Exploiting redundancy in multiple values in the same column (instead of
> compressing them independently) is another attractive way to help the
> compression. It is inherently datatype-aware, but currently can't be
> implemented directly in datatype code as there's no concept of
> column-specific storage (e.g. to store dictionary shared by all values
> in a particular column).
>
> I believe any patch addressing this use case would have to introduce
> such column-specific storage, and any solution doing that would probably
> need to introduce the same catalogs, etc.
>
> The obvious disadvantage of course is that we need to decompress the
> varlena value before doing pretty much anything with it, because the
> datatype is not aware of the compression.
>
> So I wonder if the patch should instead provide infrastructure for doing
> that in the datatype code directly.
>
> The other question is if the patch should introduce some infrastructure
> for handling the column context (e.g. column dictionary). Right now,
> whoever implements the compression has to implement this bit too.

I agree that having a place to store a per-column compression
dictionary would be awesome, but I think that could be added later on
top of this infrastructure. For example, suppose we stored each
per-column compression dictionary in a separate file and provided some
infrastructure for WAL-logging changes to the file on a logical basis
and checkpointing those updates. Then we wouldn't be tied to the
MVCC/transactional issues which storing the blobs in a table would
have, which seems like a big win. Of course, it also creates a lot of
little tiny files inside a directory that already tends to have too
many files, but maybe with some more work we can figure out a way
around that problem. Here again, it seems to me that the proposed
design is going more in the right direction than the wrong direction:
if some day we have per-column dictionaries, they will need to be tied
to specific compression methods on specific columns. If we already
have that concept, extending it to do something new is easier than if
we have to create it from scratch.
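
Purely to illustrate the direction (this is hypothetical syntax, not
something the patch provides today): a datatype-aware method with a
per-column dictionary could hang off the same machinery, e.g.

    -- hypothetical: "jsonb_dict" and the WITH options don't exist;
    -- shown only to suggest how a per-column dictionary could be
    -- tied to a compression method on a specific column
    ALTER TABLE docs ALTER COLUMN body
        COMPRESS jsonb_dict WITH (dict_size = '64kB') PRESERVE pglz;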

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
