Re: Custom compression methods

From: Ildus Kurbangaliev <i(dot)kurbangaliev(at)postgrespro(dot)ru>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Custom compression methods
Date: 2017-09-12 14:55:05
Message-ID: 20170912175505.4afa11fd@wp.localdomain
Lists: pgsql-hackers

On Thu, 7 Sep 2017 19:42:36 +0300
Ildus Kurbangaliev <i(dot)kurbangaliev(at)postgrespro(dot)ru> wrote:

> Hello hackers!
>
> I've attached a patch that implements custom compression
> methods. This patch is based on Nikita Glukhov's code (which he hasn't
> published on the mailing lists) for jsonb compression. This is an early
> but working version of the patch, and there are still a few fixes and
> features that should be implemented (like pg_dump support and support
> of compression options for types), and it requires more testing. But
> I'd like to get some feedback at the current stage first.
>
> There's been a proposal [1] by Alexander Korotkov and some discussion
> about custom compression methods before. This is an implementation of
> per-datum compression. The syntax is similar to the one in the proposal
> but not the same.
>
> Syntax:
>
> CREATE COMPRESSION METHOD <cmname> HANDLER <compression_handler>;
> DROP COMPRESSION METHOD <cmname>;
>
> A compression handler is a function that returns a structure containing
> compression routines:
>
> - configure - function called when the compression method is applied to
> an attribute
> - drop - called when the compression method is removed from an
> attribute
> - compress - compress function
> - decompress - decompress function
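To make the handler contract above concrete, here is a minimal C sketch of such a structure with a trivial "identity" method that just copies bytes. All names (CompressionRoutine, identity_handler, copy_bytes) and the function signatures are assumptions for illustration; the structure in the patch may look different.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout; the patch's real structure may differ. */
typedef struct CompressionRoutine
{
    /* called when the compression method is applied to an attribute */
    void    (*configure) (const char *options);
    /* called when the compression method is removed from an attribute */
    void    (*drop) (void);
    /* both return the number of bytes written to dst, or 0 if dst is too small */
    size_t  (*compress) (const char *src, size_t len, char *dst, size_t cap);
    size_t  (*decompress) (const char *src, size_t len, char *dst, size_t cap);
} CompressionRoutine;

static void
noop_configure(const char *options)
{
    (void) options;             /* the identity method has no options */
}

static void
noop_drop(void)
{
}

/* "Compression" that copies the input unchanged; used in both directions. */
static size_t
copy_bytes(const char *src, size_t len, char *dst, size_t cap)
{
    assert(src != NULL && dst != NULL);
    if (len > cap)
        return 0;
    memcpy(dst, src, len);
    return len;
}

/* The handler returns the routine table for this method. */
static const CompressionRoutine *
identity_handler(void)
{
    static const CompressionRoutine routine = {
        noop_configure, noop_drop, copy_bytes, copy_bytes
    };

    return &routine;
}
```

A caller would obtain the table once through the handler and then call compress and decompress through the function pointers, without knowing which method is behind them.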
>
> A user can create compressed columns with the commands below:
>
> CREATE TABLE t(a tsvector COMPRESSED <cmname> WITH <options>);
> ALTER TABLE t ALTER COLUMN a SET COMPRESSED <cmname> WITH <options>;
> ALTER TABLE t ALTER COLUMN a SET NOT COMPRESSED;
>
> There is also syntax for binding compression methods to types:
>
> ALTER TYPE <type> SET COMPRESSED <cmname>;
> ALTER TYPE <type> SET NOT COMPRESSED;
>
> There are two new tables in the catalog, pg_compression and
> pg_compression_opt. pg_compression stores the compression methods,
> and pg_compression_opt stores the specific compression options for a
> particular column.
>
> When a user binds a compression method to some column, a new record in
> pg_compression_opt is created and all further attribute values will
> contain the Oid of the compression options, while old values will
> remain unchanged. And when we alter the compression method for
> the attribute, it won't change the previous record in
> pg_compression_opt. Instead it'll create a new one, and new values will
> be stored with the new Oid. That way there is no need to recompress the
> old tuples. Also, tuples containing compressed datums can be copied to
> other tables, so records in pg_compression_opt shouldn't be removed. In
> the current patch they can be removed with DROP COMPRESSION METHOD
> CASCADE, but after that decompression won't be possible on compressed
> tuples. Maybe CASCADE should keep compression options.
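The append-only behaviour of pg_compression_opt described above can be sketched in a few lines of C. This is an in-memory toy, not catalog code: the row layout, the fixed-size array, the starting Oid and the names set_compressed and lookup_opt are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int Oid;

#define MAX_OPTS 16

/* One row of a pg_compression_opt-like table (hypothetical layout). */
typedef struct CompressionOpt
{
    Oid         optoid;     /* stored in every datum compressed with this row */
    const char *cmname;     /* compression method name */
    const char *options;    /* method-specific options */
} CompressionOpt;

static CompressionOpt opt_table[MAX_OPTS];
static size_t n_opts = 0;
static Oid  next_oid = 1000;    /* arbitrary starting value */

/* SET COMPRESSED: always append a new row, never rewrite an old one. */
static Oid
set_compressed(const char *cmname, const char *options)
{
    CompressionOpt *row;

    assert(n_opts < MAX_OPTS);
    row = &opt_table[n_opts++];
    row->optoid = next_oid++;
    row->cmname = cmname;
    row->options = options;
    return row->optoid;
}

/* Decompression finds the row by the Oid carried in the datum. */
static const CompressionOpt *
lookup_opt(Oid optoid)
{
    for (size_t i = 0; i < n_opts; i++)
    {
        if (opt_table[i].optoid == optoid)
            return &opt_table[i];
    }
    return NULL;                /* row was dropped: datum cannot be decompressed */
}
```

Because altering the method only appends a row, a datum written under the first Oid still resolves to its original method and options. The NULL return corresponds to the DROP ... CASCADE problem mentioned above.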
>
> I haven't changed the base logic of working with compressed datums. It
> means that custom compressed datums behave exactly the same as current
> LZ compressed datums, and the logic differs only in
> toast_compress_datum and toast_decompress_datum.
>
> This patch doesn't break backward compatibility and should work
> seamlessly with older versions of the database. I used one of the two
> free bits in `va_rawsize` from `varattrib_4b->va_compressed` as a flag
> for custom compressed datums. Also I renamed it to `va_info` since it
> contains not only rawsize now.
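The flag trick above can be illustrated with a small C sketch. A varlena value is smaller than 1 GB, so the raw size fits in the low 30 bits of the 32-bit field and the top two bits are free. The mask names, the choice of which free bit carries the flag, and the helper functions are assumptions for illustration, not the definitions used in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical masks: low 30 bits hold the raw size, one high bit is the flag. */
#define INFO_RAWSIZE_MASK   UINT32_C(0x3FFFFFFF)
#define INFO_CUSTOM_FLAG    UINT32_C(0x40000000)

/* Pack the raw size and the "custom compressed" flag into one 32-bit word. */
static uint32_t
make_info(uint32_t rawsize, int custom)
{
    assert(rawsize <= INFO_RAWSIZE_MASK);
    return rawsize | (custom ? INFO_CUSTOM_FLAG : 0);
}

/* Extract the raw size regardless of the flag. */
static uint32_t
info_rawsize(uint32_t info)
{
    return info & INFO_RAWSIZE_MASK;
}

/* Was this datum compressed with a custom method? */
static int
info_is_custom(uint32_t info)
{
    return (info & INFO_CUSTOM_FLAG) != 0;
}
```

With the flag clear, the word is bit-for-bit the plain raw size, which is why datums written before the change keep reading correctly.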
>
> The patch also includes a custom compression method for tsvector which
> is used in tests.
>
> [1]
> https://www.postgresql.org/message-id/CAPpHfdsdTA5uZeq6MNXL5ZRuNx%2BSig4ykWzWEAfkC6ZKMDy6%3DQ%40mail.gmail.com

Attached is a rebased version of the patch. It adds pg_dump support,
simplifies the code, and adds a separate cache for compression options.

--
Ildus Kurbangaliev
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

Attachment Content-Type Size
custom_compression_methods_v2.patch text/x-patch 314.4 KB
