Re: [HACKERS] Custom compression methods

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, David Steele <david(at)pgmasters(dot)net>, Ildus Kurbangaliev <i(dot)kurbangaliev(at)gmail(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: [HACKERS] Custom compression methods
Date: 2021-03-06 19:57:51
Message-ID: 20210306195751.GW29832@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Mar 06, 2021 at 08:59:16PM +0530, Dilip Kumar wrote:
> - Alter table set compression, will not rewrite the old data, so only
> the new tuple will be compressed with the new compression method.
> - No preserve.

+1, this simplifies things. If someone *wants* to rewrite the table, they can
VACUUM FULL, CLUSTER, or dump+restore.

I checked that it's possible to do simple column manipulations on columns
written with-lz4 with binaries built without-lz4:
- is null
- texteq if length differs
- explain analyze

If I pg_upgrade from an binary with-lz4 to one without-lz4, it fails
while restoring the schema, after running check, which is bad:
| pg_restore: error: could not execute query: ERROR: not built with lz4 support
|CREATE TABLE "public"."a" (
| "t" "text" COMPRESSION lz4,

For comparison, upgrading from binaries with-libxml to binaries without-libxml
actualy passes pg_upgrade.

It's arguable which behavior is desirable:
- allow CREATE TABLE(..COMPRESSION lz4) during pg_upgrade;
- allow CREATE TABLE(..COMPRESSION lz4) always. This has the advantage that
GetAttributeCompression() doesn't have conditional compilation. This seems
to be parallel to the libxml case - apparently, it's possible to create an
XML column, but not insert into it.
- abort pg_upgrade --check if the old cluster has lz4 and the new one doesn't,
if there are any lz4 compressed columns. This avoids the possibilty of
running an upgrade to binaries without lz4, starting a new cluster (which
leaves the old cluster unsafe to start if --link was used), and then the new
cluster may even appear to work, until an LZ4 column is accessed in a
nontrivial way. It has the disadvantage that there's no obvious parallel in
pg_upgrade (checksums and xml are the closest?). And the disadvantage that
some people might *want* the upgrade to succeed in that case to then recompile
with lz4 afterwards.

In this patch, SET default_toast_compression=lz4 "works" even if without-lz4,
but then CREATE TABLE fails. You should either allow table creation (as
above), or check in check_default_toast_compression() if lz4 is enabled.
Its comment about "catalog access" is incorrect now.

Now, I wonder if default_toast_compression should be a GUC, or a reloption.
An obvious advantage of being a GUC is that regression tests are trivial with
make installcheck.

Some minor fixes:

+ if (strcmp(def->compression, newdef->compression))
!= 0

+ * NULL for non varlena type or the uncompressed data.
remove "the"

+ * InvalidOid for the plain/external storage otherwise default
remove "the"

+ behavior is to exclude compression methods, resulting in the columns
remove "the"

+ attcompression and attstorage for the respective index attribute if ... the respective input values are
say "and/or attstorage"

+ If this variable is set to <literal>true</literal>, column's [...]
I wrote this, but I guess it should say: columns'

I think you could also say either of these:
.. column compression method details are not displayed.
.. details of column compression are not displayed.

--
Justin

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Fetter 2021-03-06 20:00:24 Re: [patch] bit XOR aggregate functions
Previous Message Vik Fearing 2021-03-06 19:57:46 Re: [patch] bit XOR aggregate functions