Re: [HACKERS] Custom compression methods

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Ildus Kurbangaliev <i(dot)kurbangaliev(at)postgrespro(dot)ru>, pgsql-hackers(at)postgresql(dot)org, Ildar Musin <i(dot)musin(at)postgrespro(dot)ru>
Subject: Re: [HACKERS] Custom compression methods
Date: 2017-12-01 21:19:26
Message-ID: f83ad2f4-697f-11cb-25a1-7b8ace1d2918@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


On 12/01/2017 08:38 PM, Alvaro Herrera wrote:
> Tomas Vondra wrote:
>
>> On 11/30/2017 09:51 PM, Alvaro Herrera wrote:
>
>>> Just passing by, but wouldn't this fit in the ACCESS METHOD group of
>>> commands? So this could be simplified down to
>>> CREATE ACCESS METHOD ts1 TYPE COMPRESSION
>>> we have that for indexes and there are patches flying for heap storage,
>>> sequences, etc.
>>
>> I think that would conflate two very different concepts. In my mind,
>> access methods define how rows are stored.
>
> In mine, they define how things are accessed (i.e. more general than
> what you're thinking). We *currently* use them to store rows [in
> indexes], but there is no reason why we couldn't expand that.
>

Not sure I follow. My argument was not as much about whether the rows
are stored as rows or in some other (columnar) format, but that access
methods deal with "tuples" (i.e. row in the "logical" way). I assume
that even if we end up implementing other access method types, they will
still be tuple-based.

OTOH compression methods (at least as introduced by this patch) operate
on individual values, and have very little to do with access to the
value (in a sense it's a transparent thing).

>
> So we group access methods in "types"; the current type we have is for
> indexes, and methods in that type define how are indexes accessed. This
> new type would indicate how would values be compressed. I disagree that
> there is no parallel there.
>
> I'm trying to avoid pointless proliferation of narrowly defined DDL
> commands.
>

Of course, the opposite case is using the same DDL for very different
concepts (although I understand you don't see it that way).

But in fairness, I don't really care if we call this COMPRESSION METHOD
or ACCESS METHOD or DARTH VADER ...

>> Furthermore, the "TYPE" in CREATE COMPRESSION method was meant to
>> restrict the compression algorithm to a particular data type (so, if it
>> relies on tsvector, you can't apply it to text columns).
>
> Yes, of course. I'm saying that the "datatype" property of a
> compression access method would be declared somewhere else, not in the
> TYPE clause of the CREATE ACCESS METHOD command. Perhaps it makes sense
> to declare that a certain compression access method is good only for a
> certain data type, and then you can put that in the options clause,
> "CREATE ACCESS METHOD hyperz TYPE COMPRESSION WITH (type = tsvector)".
> But many compression access methods would be general in nature and so
> could be used for many datatypes (say, snappy).
>
> To me it makes sense to say "let's create this method which is for data
> compression" (CREATE ACCESS METHOD hyperz TYPE COMPRESSION) followed by
> either "let's use this new compression method for the type tsvector"
> (ALTER TYPE tsvector SET COMPRESSION hyperz) or "let's use this new
> compression method for the column tc" (ALTER TABLE ALTER COLUMN tc SET
> COMPRESSION hyperz).
>

The WITH syntax does not seem particularly pretty to me, TBH. I'd be
much happier with "TYPE tsvector" and leaving WITH for the options
specific to each compression method.

FWIW I think syntax is the least critical part of this patch. It's ~1%
of the patch, and the gram.y additions are rather trivial.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2017-12-01 21:20:44 Re: Protect syscache from bloating with negative cache entries
Previous Message Bossart, Nathan 2017-12-01 21:16:30 Re: BUG #14941: Vacuum crashes