|From:||Ildus Kurbangaliev <i(dot)kurbangaliev(at)postgrespro(dot)ru>|
|Subject:||Custom compression methods|
|Views:||Raw Message | Whole Thread | Download mbox | Resend email|
I've attached a patch that implements custom compression
methods. This patch is based on Nikita Glukhov's code (which he hasn't
publish in mailing lists) for jsonb compression. This is early but
working version of the patch, and there are still few fixes and features
that should be implemented (like pg_dump support and support of
compression options for types), and it requires more testing. But I'd
like to get some feedback at the current stage first.
There's been a proposal  of Alexander Korotkov and some discussion
about custom compression methods before. This is an implementation of
per-datum compression. Syntax is similar to the one in proposal but not
CREATE COMPRESSION METHOD <cmname> HANDLER <compression_handler>;
DROP COMPRESSION METHOD <cmname>;
Compression handler is a function that returns a structure containing
- configure - function called when the compression method applied to
- drop - called when the compression method is removed from an attribute
- compress - compress function
- decompress - decompress function
User can create compressed columns with the commands below:
CREATE TABLE t(a tsvector COMPRESSED <cmname> WITH <options>);
ALTER TABLE t ALTER COLUMN a SET COMPRESSED <cmname> WITH <options>;
ALTER TABLE t ALTER COLUMN a SET NOT COMPRESSED;
Also there is syntax of binding compression methods to types:
ALTER TYPE <type> SET COMPRESSED <cmname>;
ALTER TYPE <type> SET NOT COMPRESSED;
There are two new tables in the catalog, pg_compression and
pg_compression_opt. pg_compression is used as storage of compression
methods, and pg_compression_opt is used to store specific compression
options for particular column.
When user binds a compression method to some column a new record in
pg_compression_opt is created and all further attribute values will
contain compression options Oid while old values will remain unchanged.
And when we alter a compression method for
the attribute it won't change previous record in pg_compression_opt.
Instead it'll create a new one and new values will be stored
with new Oid. That way there is no need of recompression of the old
tuples. And also tuples containing compressed datums can be copied to
other tables so records in pg_compression_opt shouldn't be removed. In
the current patch they can be removed with DROP COMPRESSION METHOD
CASCADE, but after that decompression won't be possible on compressed
tuples. Maybe CASCADE should keep compression options.
I haven't changed the base logic of working with compressed datums. It
means that custom compressed datums behave exactly the same as current
LZ compressed datums, and the logic differs only in toast_compress_datum
This patch doesn't break backward compability and should work seamlessly
with older version of database. I used one of two free bits in
`va_rawsize` from `varattrib_4b->va_compressed` as flag of custom
compressed datums. Also I renamed it to `va_info` since it contains not
only rawsize now.
The patch also includes custom compression method for tsvector which is
used in tests.
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
|Next Message||Pavel Stehule||2017-09-07 17:48:37||Re: Re: issue: record or row variable cannot be part of multiple-item INTO list|
|Previous Message||Konstantin Knizhnik||2017-09-07 15:58:31||Re: Secondary index access optimizations|