Re: [HACKERS] Custom compression methods

From: Ildar Musin <i(dot)musin(at)postgrespro(dot)ru>
To: Ildus Kurbangaliev <i(dot)kurbangaliev(at)postgrespro(dot)ru>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Евгений Шишкин <itparanoia(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Oleg Bartunov <obartunov(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Chapman Flack <chap(at)anastigmatix(dot)net>
Subject: Re: [HACKERS] Custom compression methods
Date: 2018-01-25 13:03:20
Message-ID: 6fd5dea3-8889-c7bb-9df2-79493b2b6ab0@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello Ildus,

On 23.01.2018 16:04, Ildus Kurbangaliev wrote:
> On Mon, 22 Jan 2018 23:26:31 +0300
> Ildar Musin <i(dot)musin(at)postgrespro(dot)ru> wrote:
>
> Thanks for review! Attached new version of the patch. Fixed few bugs,
> added more documentation and rebased to current master.
>
>> You need to rebase to the latest master, there are some conflicts.
>> I've applied it to the three days old master to try it.
>
> Done.
>
>>
>> As I can see the documentation is not yet complete. For example, there
>> is no section for ALTER COLUMN ... SET COMPRESSION in ddl.sgml; and
>> section "Compression Access Method Functions" in compression-am.sgml
>> hasn't been finished.
>
> Not sure about ddl.sgml, it contains more common things, but since
> postgres contains only pglz by default there is not much to show.
>
>>
>> I've implemented an extension [1] to understand the way developer
>> would go to work with new infrastructure. And for me it seems clear.
>> (Except that it took me some effort to wrap my mind around varlena
>> macros but it is probably a different topic).
>>
>> I noticed that you haven't cover 'cmdrop' in the regression tests and
>> I saw the previous discussion about it. Have you considered using
>> event triggers to handle the drop of column compression instead of
>> 'cmdrop' function? This way you would kill two birds with one stone:
>> it still provides sufficient infrastructure to catch those events
>> (and it something postgres already has for different kinds of ddl
>> commands) and it would be easier to test.
>
> I have added support for event triggers for ALTER SET COMPRESSION in
> current version. Event trigger on ALTER can be used to replace cmdrop
> function but it will be far from trivial. There is not easy way to
> understand that's attribute compression is really dropping in the
> command.
>

I've encountered unexpected behavior in command 'CREATE TABLE ... (LIKE
...)'. It seems that it copies compression settings of the table
attributes no matter which INCLUDING options are specified. E.g.

create table xxx(id serial, msg text compression pg_lz4);
alter table xxx alter column msg set storage external;
\d+ xxx
Table "public.xxx"
Column | Type | ... | Storage | Compression |
--------+---------+ ... +----------+-------------+
id | integer | ... | plain | |
msg | text | ... | external | pg_lz4 |

Now copy the table structure with "INCLUDING ALL":

create table yyy (like xxx including all);
\d+ yyy
Table "public.yyy"
Column | Type | ... | Storage | Compression |
--------+---------+ ... +----------+-------------+
id | integer | ... | plain | |
msg | text | ... | external | pg_lz4 |

And now copy without "INCLUDING ALL":

create table zzz (like xxx);
\d+ zzz
Table "public.zzz"
Column | Type | ... | Storage | Compression |
--------+---------+ ... +----------+-------------+
id | integer | ... | plain | |
msg | text | ... | extended | pg_lz4 |

As you see, compression option is copied anyway. I suggest adding new
INCLUDING COMPRESSION option to enable user to explicitly specify
whether they want or not to copy compression settings.

I found a few phrases in documentation that can be improved. But the
documentation should be checked by a native speaker.

In compression-am.sgml:
"an compression access method" -> "a compression access method"
"compression method method" -> "compression method"
"compability" -> "compatibility"
Probably "local-backend cached state" would be better to replace with
"per backend cached state"?
"Useful to store the parsed view of the compression options" -> "It
could be useful for example to cache compression options"
"and stores result of" -> "and stores the result of"
"Called when CompressionAmOptions is creating." -> "Called when
<structname>CompressionAmOptions</structname> is being initialized"

"Note that in any system cache invalidation related with
pg_attr_compression relation the options will be cleaned" -> "Note that
any <literal>pg_attr_compression</literal> relation invalidation will
cause all the cached <literal>acstate</literal> options cleared."
"Function used to ..." -> "Function is used to ..."

I think it would be nice to mention custom compression methods in
storage.sgml. At this moment it only mentions built-in pglz compression.

--
Ildar Musin
i(dot)musin(at)postgrespro(dot)ru

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Korotkov 2018-01-25 13:10:26 Re: [HACKERS] GSoC 2017 : Patch for predicate locking in Gist index
Previous Message David Rowley 2018-01-25 12:55:36 Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath