|From:||Ildus Kurbangaliev <i(dot)kurbangaliev(at)postgrespro(dot)ru>|
|Subject:||Re: Custom compression methods|
On Thu, 7 Sep 2017 19:42:36 +0300
Ildus Kurbangaliev <i(dot)kurbangaliev(at)postgrespro(dot)ru> wrote:
> Hello hackers!
> I've attached a patch that implements custom compression
> methods. This patch is based on Nikita Glukhov's code (which he hasn't
> published on the mailing lists) for jsonb compression. This is an early
> but working version of the patch, and there are still a few fixes and
> features that should be implemented (like pg_dump support and support
> of compression options for types), and it requires more testing. But
> I'd like to get some feedback at the current stage first.
> There has been a proposal by Alexander Korotkov and some discussion
> about custom compression methods before. This is an implementation of
> per-datum compression. The syntax is similar to the one in the proposal
> but not the same.
> CREATE COMPRESSION METHOD <cmname> HANDLER <compression_handler>;
> DROP COMPRESSION METHOD <cmname>;
> A compression handler is a function that returns a structure
> containing compression routines:
> - configure - function called when the compression method is applied
> to an attribute
> - drop - called when the compression method is removed from an
> attribute
> - compress - compress function
> - decompress - decompress function
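
As a rough illustration of the handler idea described above, here is a minimal standalone C sketch. It is not the patch's code: the struct name CompressionRoutine, the field signatures, and the toy run-length encoder are all assumptions made for the example, and it uses plain byte buffers rather than PostgreSQL datums.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical routine table. The four members mirror the routines listed
 * in the mail; the names and signatures are assumptions, not the patch's.
 */
typedef struct CompressionRoutine
{
    /* called when the method is applied to an attribute */
    void   (*configure)(const char *options);
    /* called when the method is removed from an attribute */
    void   (*drop)(void);
    /* both return the number of bytes written, or 0 if dst is too small */
    size_t (*compress)(const unsigned char *src, size_t srclen,
                       unsigned char *dst, size_t dstcap);
    size_t (*decompress)(const unsigned char *src, size_t srclen,
                         unsigned char *dst, size_t dstcap);
} CompressionRoutine;

/* Toy run-length encoder: emits (run length, byte value) pairs. */
static size_t
rle_compress(const unsigned char *src, size_t srclen,
             unsigned char *dst, size_t dstcap)
{
    size_t in = 0, out = 0;

    while (in < srclen)
    {
        unsigned char value = src[in];
        size_t run = 1;

        /* a run is capped at 255 so its length fits in one byte */
        while (in + run < srclen && src[in + run] == value && run < 255)
            run++;
        if (out + 2 > dstcap)
            return 0;
        dst[out++] = (unsigned char) run;
        dst[out++] = value;
        in += run;
    }
    return out;
}

/* Inverse of rle_compress: expands each (run length, byte value) pair. */
static size_t
rle_decompress(const unsigned char *src, size_t srclen,
               unsigned char *dst, size_t dstcap)
{
    size_t in = 0, out = 0;

    while (in + 1 < srclen)
    {
        size_t run = src[in];

        if (out + run > dstcap)
            return 0;
        memset(dst + out, src[in + 1], run);
        out += run;
        in += 2;
    }
    return out;
}

/* This toy method has no options and no state, so these are no-ops. */
static void rle_configure(const char *options) { (void) options; }
static void rle_drop(void) { }

/* The "handler": returns the structure containing the routines. */
const CompressionRoutine *
rle_compression_handler(void)
{
    static const CompressionRoutine routine = {
        rle_configure, rle_drop, rle_compress, rle_decompress
    };

    return &routine;
}
```

A caller would obtain the table once through rle_compression_handler() and then invoke compress and decompress through the function pointers, which is roughly how a handler-based design keeps the core code independent of any particular method.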
> Users can create compressed columns with the commands below:
> CREATE TABLE t(a tsvector COMPRESSED <cmname> WITH <options>);
> ALTER TABLE t ALTER COLUMN a SET COMPRESSED <cmname> WITH <options>;
> ALTER TABLE t ALTER COLUMN a SET NOT COMPRESSED;
> There is also syntax for binding compression methods to types:
> ALTER TYPE <type> SET COMPRESSED <cmname>;
> ALTER TYPE <type> SET NOT COMPRESSED;
> There are two new tables in the catalog, pg_compression and
> pg_compression_opt. pg_compression stores the compression methods,
> and pg_compression_opt stores the specific compression options for a
> particular column.
> When a user binds a compression method to a column, a new record in
> pg_compression_opt is created and all further attribute values will
> contain the compression options Oid, while old values will remain
> unchanged. When we alter the compression method for the attribute,
> the previous record in pg_compression_opt is not changed.
> Instead a new one is created and new values will be stored
> with the new Oid. That way there is no need to recompress the old
> tuples. Also, tuples containing compressed datums can be copied to
> other tables, so records in pg_compression_opt shouldn't be removed. In
> the current patch they can be removed with DROP COMPRESSION METHOD
> CASCADE, but after that decompression won't be possible on compressed
> tuples. Maybe CASCADE should keep compression options.
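
The append-only behaviour of pg_compression_opt described above can be sketched with a small in-memory model. This is only an illustration under assumed names: CompressionOpt, bind_compression, lookup_compression_opt, the fixed-size array, and the starting Oid value are all hypothetical and do not come from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_OPTS 16

typedef unsigned int Oid;

/* Hypothetical in-memory stand-in for one pg_compression_opt record. */
typedef struct CompressionOpt
{
    Oid         oid;        /* the value that compressed datums would carry */
    const char *cmname;     /* compression method name */
    const char *options;    /* options given in WITH <options> */
} CompressionOpt;

static CompressionOpt opt_catalog[MAX_OPTS];
static size_t opt_count = 0;
static Oid next_oid = 16384;    /* arbitrary starting value for the sketch */

/*
 * Binding or altering a column's compression never edits an existing
 * record; it always appends a new one and returns the new Oid.
 */
Oid
bind_compression(const char *cmname, const char *options)
{
    CompressionOpt *rec;

    assert(opt_count < MAX_OPTS);
    rec = &opt_catalog[opt_count++];
    rec->oid = next_oid++;
    rec->cmname = cmname;
    rec->options = options;
    return rec->oid;
}

/*
 * Decompression looks up the record by the Oid stored in the datum.
 * NULL means the record is gone and the datum cannot be decompressed.
 */
const CompressionOpt *
lookup_compression_opt(Oid oid)
{
    size_t i;

    for (i = 0; i < opt_count; i++)
    {
        if (opt_catalog[i].oid == oid)
            return &opt_catalog[i];
    }
    return NULL;
}
```

In this model a datum written before an ALTER keeps the Oid returned by the first bind_compression call, and that lookup keeps working after a second call. Removing a record would make the lookup return NULL, which corresponds to the decompression problem mentioned for DROP COMPRESSION METHOD CASCADE.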
> I haven't changed the base logic of working with compressed datums. It
> means that custom compressed datums behave exactly the same as current
> LZ compressed datums, and the logic differs only in
> toast_compress_datum and toast_decompress_datum.
> This patch doesn't break backward compatibility and should work
> seamlessly with older versions of the database. I used one of the two
> free bits in `va_rawsize` from `varattrib_4b->va_compressed` as a flag
> for custom compressed datums. I also renamed it to `va_info` since it
> no longer contains only the raw size.
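
The flag-bit idea can be sketched as follows. A varlena value is limited to 1 GB, so the raw size fits in the low 30 bits of the 32-bit field and the top two bits are free. Which of the two bits the patch actually uses is not stated in the mail, so the choice of bit 30 below, like the macro and function names, is an assumption for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Low 30 bits hold the raw (uncompressed) size. */
#define VA_RAWSIZE_MASK  UINT32_C(0x3FFFFFFF)
/* Hypothetical choice: bit 30 marks a custom compressed datum. */
#define VA_CUSTOM_FLAG   UINT32_C(0x40000000)

/* Pack a raw size and the custom-compression flag into one 32-bit word. */
static inline uint32_t
va_info_make(uint32_t rawsize, bool custom)
{
    return (rawsize & VA_RAWSIZE_MASK) | (custom ? VA_CUSTOM_FLAG : 0);
}

/* Extract the raw size, ignoring the flag bits. */
static inline uint32_t
va_info_rawsize(uint32_t va_info)
{
    return va_info & VA_RAWSIZE_MASK;
}

/* True if the datum was compressed by a custom method. */
static inline bool
va_info_is_custom(uint32_t va_info)
{
    return (va_info & VA_CUSTOM_FLAG) != 0;
}
```

A datum written by an older version has both high bits clear, so va_info_is_custom() returns false for it and the stored value still reads as the plain raw size. That is the property that would let existing LZ compressed data keep working unchanged.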
> The patch also includes a custom compression method for tsvector, which
> is used in the tests.
Attached is a rebased version of the patch. Support for pg_dump was
added, the code was simplified, and a separate cache for compression
options was added.
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company