Re: [HACKERS] Custom compression methods

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, David Steele <david(at)pgmasters(dot)net>, Ildus Kurbangaliev <i(dot)kurbangaliev(at)gmail(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [HACKERS] Custom compression methods
Date: 2021-02-10 21:56:17
Message-ID: CA+TgmoZ9Rap=NuJLSvw2u=9fibUdB2L_Cs2Wb8FD49vhiXsdpQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, Feb 10, 2021 at 3:06 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> I think you have a fairly big problem with row types. Consider this example:
>
> create table t1 (a int, b text compression pglz);
> create table t2 (a int, b text compression lz4);
> create table t3 (x t1);
> insert into t1 values (1, repeat('foo', 1000));
> insert into t2 values (1, repeat('foo', 1000));
> insert into t3 select t1 from t1;
> insert into t3 select row(a, b)::t1 from t2;
>
> rhaas=# select pg_column_compression((t3.x).b) from t3;
> pg_column_compression
> -----------------------
> pglz
> lz4
> (2 rows)
>
> That's not good, because now

...because now I hit send too soon. Also, because now column b has
implicit dependencies on both compression AMs and the rest of the
system has no idea. I think we probably should have a rule that
nothing except pglz is allowed inside of a record, just to keep it
simple. The record overall can be toasted, so it's not clear why we
should also be toasting the original columns at all, but I think
precedent probably argues for continuing to allow pglz, as it can
already be like that on disk and pg_upgrade is a thing. The same kind
of issue probably exists for arrays and range types.
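
To make the proposed rule concrete, here's a rough, untested sketch of
the sort of guard I'm imagining. The helper and its call site are made
up, and I'm guessing at how the compression-method lookup is spelled
(toast_get_compression_id / TOAST_PGLZ_COMPRESSION_ID), so adjust to
whatever the patch actually calls those things:

/*
 * Hypothetical, illustration only: before a varlena value is stored
 * inside a composite datum, force anything compressed with a method
 * other than pglz back to uncompressed form.  Assumes the caller only
 * passes varlena (attlen == -1) attributes; the real enforcement point
 * would be wherever record/array datums get built.
 */
#include "postgres.h"
#include "access/detoast.h"
#include "access/toast_compression.h"

static Datum
flatten_nondefault_compression(Datum value, bool isnull)
{
    struct varlena *attr;

    if (isnull)
        return value;

    attr = (struct varlena *) DatumGetPointer(value);

    /* Only in-line compressed varlenas carry a compression-method header. */
    if (VARATT_IS_COMPRESSED(attr) &&
        toast_get_compression_id(attr) != TOAST_PGLZ_COMPRESSION_ID)
        return PointerGetDatum(detoast_attr(attr));

    return value;
}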

I poked around a bit trying to find ways of getting data compressed
with one compression method into a column that was marked with a
different one, but wasn't able to break it.

In CompareCompressionMethodAndDecompress, I think this is still
playing a bit fast and loose with the rules around slots. I think we
can do better. Suppose that at the point where we discover that we
need to decompress at least one attribute, we create the new slot
right then, and also memcpy tts_values and tts_isnull. Then, for that
attribute and any later attributes that need decompression, we overwrite
the corresponding tts_values entry in the *new* slot, leaving the old one
untouched. That way, once all the attributes have been processed, the
if (decompressed_any) block has much less work to do. The advantage of
this is that you haven't tainted the old slot; it still has whatever
contents it had before, and is in a clean state, which seems better to me.
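
Roughly what I have in mind, as an untested sketch. needs_decompression()
and decompress_value() are placeholders for whatever the function already
does to detect a mismatched compression method and produce the replacement
datum, and a real version would presumably create the slot once and reuse
it rather than allocating one per tuple:

/* Assumes slot_getallattrs(slot) has already been done. */
TupleTableSlot *newslot = NULL;
int         natts = slot->tts_tupleDescriptor->natts;

for (int attno = 0; attno < natts; attno++)
{
    if (slot->tts_isnull[attno] || !needs_decompression(slot, attno))
        continue;

    if (newslot == NULL)
    {
        /*
         * First attribute that needs work: set up the new slot and copy
         * the old slot's arrays, leaving the old slot untouched.
         */
        newslot = MakeSingleTupleTableSlot(slot->tts_tupleDescriptor,
                                           &TTSOpsVirtual);
        memcpy(newslot->tts_values, slot->tts_values,
               natts * sizeof(Datum));
        memcpy(newslot->tts_isnull, slot->tts_isnull,
               natts * sizeof(bool));
    }

    /* Overwrite only the new slot's copy of the value. */
    newslot->tts_values[attno] = decompress_value(slot, attno);
}

if (newslot == NULL)
    return slot;            /* nothing needed decompression */

ExecStoreVirtualTuple(newslot);
return newslot;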

It's unclear to me whether this function actually needs to
ExecMaterializeSlot(newslot). It definitely does need to
ExecStoreVirtualTuple(newslot) and I think it's a very good idea, if
not absolutely mandatory, for it not to modify anything about the old
slot. But what's the argument that the new slot needs to be
materialized at this point? It may be needed if the old slot would have
had to be materialized anyway, but it's something to think about.

The CREATE TABLE documentation says that COMPRESSION is a kind of
column constraint, but that's wrong. For example, you can't write
CREATE TABLE a (b int4 CONSTRAINT thunk COMPRESSION lz4), contrary to
what the syntax summary implies. When you fix this so that the
documentation matches the actual grammar, you may also need to
move the longer description further up in create_table.sgml so the
order matches.

The use of VARHDRSZ_COMPRESS in toast_get_compression_oid() appears to
be incorrect. VARHDRSZ_COMPRESS is offsetof(varattrib_4b,
va_compressed.va_data). But what gets externalized in the case of a
compressed datum is just VARDATA(dval), which excludes the length
word, unlike VARHDRSZ_COMPRESS, which does not. This has no
consequences since we're only going to fetch 1 chunk either way, but I
think we should make it correct.
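
Spelling out the arithmetic, just to illustrate the point (using the
patch's VARHDRSZ_COMPRESS and the usual VARHDRSZ):

/*
 * In-line compressed layout (varattrib_4b.va_compressed), schematically:
 *
 *     [ 4-byte length word | 4-byte rawsize/method word | compressed data ]
 *
 * VARHDRSZ_COMPRESS is offsetof(varattrib_4b, va_compressed.va_data), so it
 * includes the length word.  But what gets written to the TOAST table is
 * VARDATA(dval), i.e. everything after the length word, so the prefix of
 * the external data that actually holds the compression information is
 * only the rawsize/method word, which presumably is what the slice fetch
 * here wants:
 */
Size        needed = VARHDRSZ_COMPRESS - VARHDRSZ;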

TOAST_COMPRESS_SET_SIZE_AND_METHOD() could Assert something about cm_method.
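
For example, guessing at the names of the valid method identifiers
(adjust to whatever the patch actually calls them), something like

Assert(cm_method == TOAST_PGLZ_COMPRESSION_ID ||
       cm_method == TOAST_LZ4_COMPRESSION_ID);

would catch a garbage value before it gets packed into the header bits.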

Small delta patch with a few other suggested changes attached.

--
Robert Haas
EDB: http://www.enterprisedb.com

Attachment Content-Type Size
fixups.patch application/octet-stream 5.2 KB
