Re: [HACKERS] Custom compression methods

From: Ildar Musin <i(dot)musin(at)postgrespro(dot)ru>
To: Ildus Kurbangaliev <i(dot)kurbangaliev(at)postgrespro(dot)ru>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Евгений Шишкин <itparanoia(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Oleg Bartunov <obartunov(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Chapman Flack <chap(at)anastigmatix(dot)net>
Subject: Re: [HACKERS] Custom compression methods
Date: 2018-01-26 16:07:28
Message-ID: bc430af2-4388-4ad3-acda-f1a95b9310ae@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello Ildus,

I continue reviewing your patch. Here are some thoughts.

1. When I set column storage to EXTERNAL then I cannot set compression.
Seems reasonable:
create table test(id serial, msg text);
alter table test alter column msg set storage external;
alter table test alter column msg set compression pg_lz4;
ERROR: storage for "msg" should be MAIN or EXTENDED

But if I reorder commands then it's ok:
create table test(id serial, msg text);
alter table test alter column msg set compression pg_lz4;
alter table test alter column msg set storage external;
\d+ test
Table "public.test"
Column | Type | ... | Storage | Compression
--------+---------+ ... +----------+-------------
id | integer | ... | plain |
msg | text | ... | external | pg_lz4

So we could either allow user to set compression settings even when
storage is EXTERNAL but with warning or prohibit user to set compression
and external storage at the same time. The same thing is with setting
storage PLAIN.

2. I think TOAST_COMPRESS_SET_RAWSIZE macro could be rewritten like
following to prevent overwriting of higher bits of 'info':

((toast_compress_header *) (ptr))->info = \
((toast_compress_header *) (ptr))->info & ~RAWSIZEMASK | (len);

It maybe does not matter at the moment since it is only used once, but
it could save some efforts for other developers in future.
In TOAST_COMPRESS_SET_CUSTOM() instead of changing individual bits you
may do something like this:

#define TOAST_COMPRESS_SET_CUSTOM(ptr) \
do { \
((toast_compress_header *) (ptr))->info = \
((toast_compress_header *) (ptr))->info & RAWSIZEMASK | ((uint32) 0x02
<< 30) \
} while (0)

Also it would be nice if bit flags were explained and maybe replaced by
a macro.

3. In AlteredTableInfo, BulkInsertStateData and some functions (eg
toast_insert_or_update) there is a hash table used to keep preserved
compression methods list per attribute. I think a simple array of List*
would be sufficient in this case.

4. In optionListToArray() you can use list_qsort() to sort options list
instead of converting it manually into array and then back to a list.

5. Redundunt #includes:

In heap.c:
#include "access/reloptions.h"
In tsvector.c:
#include "catalog/pg_type.h"
#include "common/pg_lzcompress.h"
In relcache.c:
#include "utils/datum.h"

6. Just a minor thing: no reason to change formatting in copy.c
- heap_insert(resultRelInfo->ri_RelationDesc, tuple, mycid,
- hi_options, bistate);
+ heap_insert(resultRelInfo->ri_RelationDesc, tuple,
+ mycid, hi_options, bistate);

7. Also in utility.c the extra new line was added which isn't relevant
for this patch.

8. In parse_utilcmd.h the 'extern' keyword was removed from
transformRuleStmt declaration which doesn't make sense in this patch.

9. Comments. Again, they should be read by a native speaker. So just a
few suggestions:
toast_prepare_varlena() - comment needed
invalidate_amoptions_cache() - comment format doesn't match other
functions in the file

In htup_details.h:
/* tuple contain custom compressed
* varlenas */
should be "contains"

--
Ildar Musin
i(dot)musin(at)postgrespro(dot)ru

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Stephen Frost 2018-01-26 16:09:25 Re: [HACKERS] Patch: Add --no-comments to skip COMMENTs with pg_dump
Previous Message Stephen Frost 2018-01-26 16:07:25 Re: Boolean partitions syntax