Re: [HACKERS] Custom compression methods

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, David Steele <david(at)pgmasters(dot)net>, Ildus Kurbangaliev <i(dot)kurbangaliev(at)gmail(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [HACKERS] Custom compression methods
Date: 2021-03-10 11:51:45
Message-ID: CAFiTN-sBF8ZK3XP105AybsH5s5KtyWoRNbYzgPXQOKACdn1a_Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 9, 2021 at 1:56 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> Some other review comments:

I have worked on these review comments. Please find my response inline
>
> toast_get_compression_method() should now return char, not Oid.

Fixed

> With this design, we can support changing the compression method on a
> column quite easily. It's just a hint, like the STORAGE parameter. It
> has no bearing on what can be present in the table, but just controls
> how new values are stored. It would be nice to have a way to force
> anything compressed with the old method to be re-compressed with the
> new method, but not having that doesn't preclude allowing the
> parameter to be changed.

As responded upthread, as of now I am planning to provide a syntax as

ALTER COLUMN col SET COMPRESSION method REWRITE, if user wants to
rewrite the table.

> I am tempted to propose that we collapse compress_lz4.c and
> compress_pglz.c into a single file, get rid of the directory, and just
> have something like src/backend/access/common/toast_compression.c. The
> files are awfully short, and making a whole new directory for that
> small amount of code seems like overkill.

I have done that, along with that I have also renamed compressapi.h to
toast_compression.h, and along with that I have done some more
refactoring of the code especially in toast_compression.c and
toast_compression.h, please have a look.

> I think the pg_dump argument should be --no-toast-compression, not
> --no-toast-compressions.

Done

I agree with Justin that pg_restore should
> have the option also.

Not done anything for pg_restore as we already agreed upon this.

> Man, it would be really nice to be able to set the default for new
> tables, rather than having all these places hard-coded to use
> DefaultCompressionMethod. Surely lotsa people are going to want to set
> toast_compression = lz4 in postgresql.conf and forget about it.

As Justine pointed out we are doing in 0002, maybe we should merge
0001 and 0002 but I kept is that way so that the review can be easy.

> Is there any reason not to change varattrib_4b's description of
> va_tcinfo that says "and flags" to instead say "and compression
> method"? And rename VARFLAGS_4B_C to VARCOMPRESS_4B_C? I don't know
> why we should call it flags when we know it's specifically compression
> information.

Done.

> You should probably have a test that involves altering the type of a
> varlena column to non-varlena, and the other way around, and make sure
> that changing integer -> text sets attcompression and doing the
> reverse clears it.

Done, also added the test case to see that setting the storage type to
plain on varlena type should not clear the compression method.

> You need to update catalogs.sgml.

Done

> On the whole I don't see a whole lot to complain about here. I don't
> love giving up on the idea of tracking which compression methods are
> used where, but making that work without performance regressions seems
> very difficult and perhaps just outright impossible, and dealing with
> all the concurrency problems that introduces is a pain, too. I think
> accepting a feature that gives us LZ4 compression is better than
> rejecting it because we can't solve those problems.

Right.

Apart from this I have also fixed the comment given by Justin.

The pending comment is providing a way to rewrite a table and
re-compress the data with the current compression method.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
v33-0004-default-to-with-lz4.patch text/x-patch 1.7 KB
v33-0002-Add-default_toast_compression-GUC.patch text/x-patch 11.4 KB
v33-0003-Alter-table-set-compression.patch text/x-patch 20.2 KB
v33-0001-Built-in-compression-method.patch text/x-patch 96.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Euler Taveira 2021-03-10 12:17:54 Re: Logical Replication - improve error message while adding tables to the publication in check_publication_add_relation
Previous Message houzj.fnst@fujitsu.com 2021-03-10 11:39:48 RE: Parallel Inserts in CREATE TABLE AS