Re: [HACKERS] Custom compression methods

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Ildus Kurbangaliev <i(dot)kurbangaliev(at)postgrespro(dot)ru>, Andres Freund <andres(at)anarazel(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Oleg Bartunov <obartunov(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Custom compression methods
Date: 2017-11-20 15:18:30
Message-ID: 58471b21-2f8c-9fa9-63ca-4c37883b8307@2ndquadrant.com
Lists: pgsql-hackers

On 11/20/2017 10:44 AM, Ildus Kurbangaliev wrote:
> On Mon, 20 Nov 2017 00:23:23 +0100
> Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>
>> On 11/15/2017 02:13 PM, Robert Haas wrote:
>>> On Wed, Nov 15, 2017 at 4:09 AM, Ildus Kurbangaliev
>>> <i(dot)kurbangaliev(at)postgrespro(dot)ru> wrote:
>>>> So in the next version of the patch I can just unlink the options
>>>> from compression methods and dropping compression method will not
>>>> affect already compressed tuples. They still could be
>>>> decompressed.
>>>
>>> I guess I don't understand how that can work. I mean, if somebody
>>> removes a compression method - i.e. uninstalls the library - and you
>>> don't have a way to make sure there are no tuples that can only be
>>> uncompressed by that library - then you've broken the database.
>>> Ideally, there should be a way to add a new compression method via
>>> an extension ... and then get rid of it and all dependencies
>>> thereupon.
>>
>> I share your confusion. Once you do DROP COMPRESSION METHOD, there
>> must be no remaining data compressed with it. But that's what the
>> patch is doing already - it enforces this using dependencies, as
>> usual.
>>
>> Ildus, can you explain what you meant? How could the data still be
>> decompressed after DROP COMPRESSION METHOD, and possibly after
>> removing the .so library?
>
> The removal of the .so library will break all compressed tuples. I
> don't see a way to avoid it. I meant that DROP COMPRESSION METHOD could
> remove the record from 'pg_compression' table, but actually the
> compressed tuple needs only a record from 'pg_compression_opt' where
> its options are located. And there is dependency between an extension
> and the options so you can't just remove the extension without CASCADE,
> postgres will complain.
>

I don't think we need to do anything smart here - it should behave just
like dropping a data type, for example. That is, error out if there are
columns using the compression method (without CASCADE), and drop all the
columns (with CASCADE).
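
To make the intended semantics concrete, here is a minimal Python sketch
of the RESTRICT vs. CASCADE behaviour described above. It is only a toy
model, not PostgreSQL code and not what the patch does internally: the
names Catalog, add_column, drop_method and the method name "my_lz" are
all made up for illustration.

```python
# Toy model of dependency-checked DROP. Not PostgreSQL code: every name
# here (Catalog, add_column, drop_method, "my_lz") is hypothetical.


class DependencyError(Exception):
    """Raised when an object is dropped while columns still depend on it."""


class Catalog:
    def __init__(self):
        # compression method name -> set of (table, column) pairs using it
        self.dependents = {}

    def create_method(self, name):
        self.dependents[name] = set()

    def add_column(self, table, column, method):
        # Record the dependency at the moment the column starts using it.
        self.dependents[method].add((table, column))

    def drop_method(self, name, cascade=False):
        users = self.dependents[name]
        if users and not cascade:
            # Without CASCADE: refuse, and leave the catalog untouched.
            raise DependencyError(
                f"cannot drop {name}: {len(users)} column(s) depend on it"
            )
        # With CASCADE (or with no users): dependent columns go away too.
        dropped = sorted(users)
        del self.dependents[name]
        return dropped


catalog = Catalog()
catalog.create_method("my_lz")
catalog.add_column("docs", "body", "my_lz")

try:
    catalog.drop_method("my_lz")
    restrict_failed = False
except DependencyError:
    restrict_failed = True

dropped_columns = catalog.drop_method("my_lz", cascade=True)
```

The point of the sketch is that the plain drop either fails cleanly or
removes everything that referenced the method, so no compressed data can
outlive it.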

Leaving the pg_compression_opt record around is not a solution. Not only
is it confusing (I'm not aware of any extension that behaves like this),
but the user is also likely to remove the .so file (perhaps not directly,
but e.g. by removing the rpm package providing it).

> Still it's a problem if the user used for example `SELECT
> <compressed_column> INTO * FROM *` because postgres will copy compressed
> tuples, and there will not be any dependencies between destination and
> the options.
>

This seems like a rather fatal design flaw, though. I'd say we need to
force recompression of the data, in such cases. Otherwise all the
dependency tracking is rather pointless.
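
A small Python sketch of why copying compressed datums verbatim defeats
the dependency tracking, and how recompression on copy avoids it. Again
this is a toy model under my own assumptions, not the patch: zlib stands
in for both compression methods, and the names CODECS, copy_verbatim,
copy_recompress and untracked_methods are hypothetical.

```python
import zlib

# Toy model, not PostgreSQL code. A datum is a (method, payload) pair and
# each table keeps a set of methods it is known to depend on.
# zlib is only a stand-in for real compression methods.
CODECS = {
    "builtin": (lambda raw: zlib.compress(raw, 1), zlib.decompress),
    "custom": (lambda raw: zlib.compress(raw, 9), zlib.decompress),
}


def compress(method, raw):
    return (method, CODECS[method][0](raw))


def decompress(datum):
    method, payload = datum
    return CODECS[method][1](payload)


def copy_verbatim(rows):
    """Copy datums as-is; the target records no dependency at all."""
    return list(rows), set()


def copy_recompress(rows, target_method):
    """Decompress each datum, then recompress with the target's method."""
    new_rows = [compress(target_method, decompress(d)) for d in rows]
    return new_rows, {target_method}


def untracked_methods(rows, tracked):
    """Methods used by stored datums that the table does not depend on."""
    return {method for method, _ in rows} - tracked


source = [compress("custom", b"hello " * 50)]

rows_a, deps_a = copy_verbatim(source)
rows_b, deps_b = copy_recompress(source, "builtin")

leak_a = untracked_methods(rows_a, deps_a)  # "custom" is used but untracked
leak_b = untracked_methods(rows_b, deps_b)  # nothing untracked
```

With the verbatim copy the destination holds data that needs "custom"
without any recorded dependency on it, so dropping that method would not
be blocked. Recompressing on copy keeps the two in sync.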

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
