Re: [HACKERS] Custom compression methods

From: Ildus Kurbangaliev <i(dot)kurbangaliev(at)postgrespro(dot)ru>
To: Ildar Musin <i(dot)musin(at)postgrespro(dot)ru>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Евгений Шишкин <itparanoia(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Oleg Bartunov <obartunov(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Chapman Flack <chap(at)anastigmatix(dot)net>
Subject: Re: [HACKERS] Custom compression methods
Date: 2018-01-29 11:44:52
Message-ID: 20180129144452.08f04dc3@wp.localdomain
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

On Fri, 26 Jan 2018 19:07:28 +0300
Ildar Musin <i(dot)musin(at)postgrespro(dot)ru> wrote:

> Hello Ildus,
>
> I continue reviewing your patch. Here are some thoughts.

Thanks! Attached new version of the patch.

>
> 1. When I set column storage to EXTERNAL then I cannot set
> compression. Seems reasonable:
> create table test(id serial, msg text);
> alter table test alter column msg set storage external;
> alter table test alter column msg set compression pg_lz4;
> ERROR: storage for "msg" should be MAIN or EXTENDED

Changed the behaviour, now it's ok to change storages in any
directions for toastable types. Also added protection from untoastable
types.

>
>
> 2. I think TOAST_COMPRESS_SET_RAWSIZE macro could be rewritten like
> following to prevent overwriting of higher bits of 'info':
>
> ((toast_compress_header *) (ptr))->info = \
> ((toast_compress_header *) (ptr))->info & ~RAWSIZEMASK |
> (len);
>
> It maybe does not matter at the moment since it is only used once, but
> it could save some efforts for other developers in future.
> In TOAST_COMPRESS_SET_CUSTOM() instead of changing individual bits you
> may do something like this:
>
> #define TOAST_COMPRESS_SET_CUSTOM(ptr) \
> do { \
> ((toast_compress_header *) (ptr))->info = \
> ((toast_compress_header *) (ptr))->info & RAWSIZEMASK
> | ((uint32) 0x02 << 30) \
> } while (0)
>
> Also it would be nice if bit flags were explained and maybe replaced
> by a macro.

I noticed that there is no need of TOAST_COMPRESS_SET_CUSTOM at all,
so I just removed it, TOAST_COMPRESS_SET_RAWSIZE will set needed flags.

>
>
> 3. In AlteredTableInfo, BulkInsertStateData and some functions (eg
> toast_insert_or_update) there is a hash table used to keep preserved
> compression methods list per attribute. I think a simple array of
> List* would be sufficient in this case.

Not sure about that, it will just complicate things without sufficient
improvements. Also it would require the passing the length of the array
and require more memory for tables with large number of attributes. But,
I've made default size of the hash table smaller, since unlikely the
user will change compression of many attributes at once.

>
>
> 4. In optionListToArray() you can use list_qsort() to sort options
> list instead of converting it manually into array and then back to a
> list.

Good, didn't know about this function.

>
>
> 5. Redundunt #includes:
>
> In heap.c:
> #include "access/reloptions.h"
> In tsvector.c:
> #include "catalog/pg_type.h"
> #include "common/pg_lzcompress.h"
> In relcache.c:
> #include "utils/datum.h"
>
> 6. Just a minor thing: no reason to change formatting in copy.c
> - heap_insert(resultRelInfo->ri_RelationDesc, tuple, mycid,
> - hi_options, bistate);
> + heap_insert(resultRelInfo->ri_RelationDesc, tuple,
> + mycid, hi_options, bistate);
>
> 7. Also in utility.c the extra new line was added which isn't
> relevant for this patch.
>
> 8. In parse_utilcmd.h the 'extern' keyword was removed from
> transformRuleStmt declaration which doesn't make sense in this patch.
>
> 9. Comments. Again, they should be read by a native speaker. So just
> a few suggestions:
> toast_prepare_varlena() - comment needed
> invalidate_amoptions_cache() - comment format doesn't match other
> functions in the file
>
> In htup_details.h:
> /* tuple contain custom compressed
> * varlenas */
> should be "contains"
>

5-9, all done. Thank you for noticing.

--
---
Ildus Kurbangaliev
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

Attachment Content-Type Size
custom_compression_methods_v11.patch text/x-patch 325.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fabien COELHO 2018-01-29 12:03:16 Re: General purpose hashing func in pgbench
Previous Message Yugo Nagata 2018-01-29 11:25:50 Re: [HACKERS] [PATCH] Lockable views