Re: [HACKERS] Custom compression methods

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, David Steele <david(at)pgmasters(dot)net>, Ildus Kurbangaliev <i(dot)kurbangaliev(at)gmail(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [HACKERS] Custom compression methods
Date: 2021-03-01 15:23:09
Message-ID: CAFiTN-shkKUM+UnKBuJ-B3mZMXPiR9-XHApmUHZnDf=qx9CgAg@mail.gmail.com
Lists: pgsql-hackers

On Mon, Mar 1, 2021 at 5:36 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Mon, Mar 1, 2021 at 11:06 AM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
>
> > Thanks. It seems like that explains it.
> > I think if that's a problem with recent versions, then you'll have to
> > conditionally disable slicing.
> > https://packages.debian.org/liblz4-dev
> >
> > Slicing isn't generally usable if it sometimes makes people's data inaccessible
> > and gives errors about corruption.
> >
> > I guess you could make it a compile time test on these constants (I don't know
> > the necessary version, though)
> >
> > #define LZ4_VERSION_MAJOR 1 /* for breaking interface changes */
> > #define LZ4_VERSION_MINOR 7 /* for new (non-breaking) interface capabilities */
> > #define LZ4_VERSION_RELEASE 1 /* for tweaks, bug-fixes, or development */
> > #define LZ4_VERSION_NUMBER (LZ4_VERSION_MAJOR *100*100 + LZ4_VERSION_MINOR *100 + LZ4_VERSION_RELEASE)
> >
> > If the version is too low, either make it an #error or disable slicing.
> > The OS's usual library versioning infrastructure will make sure the runtime
> > version is at least the MAJOR+MINOR of the compile-time version.
>
> I think we can check the version, and if it is too low, i.e. below 1.8.3
> (the release in which the slicing issue was fixed), then we can call the
> full decompression routine from the slicing function.

I have done that in the attached patch. Along with that, I have also
fixed the other issues raised by Justin related to the compression
method GUC patch and also removed the stuff from the GUC patch which
is not required for the built-in methods.
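
For readers following along without the patch, here is a rough sketch of
the fallback idea, not the code from the attachment. It reuses the
LZ4_VERSION_NUMBER encoding quoted above (so 1.8.3 becomes 10803), but
every name starting with sketch_ or toy_ is hypothetical, and the version
is passed in as a plain argument only so the sketch builds without lz4.h.
In a real build the test would be made against the library's own version
number, and the two callbacks would be the library's partial and full
decompression routines.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same encoding as LZ4_VERSION_NUMBER: major*100*100 + minor*100 + release. */
#define SKETCH_VERSION_NUMBER(major, minor, release) \
    ((major) * 100 * 100 + (minor) * 100 + (release))

/* 1.8.3 is the release named in this thread as fixing the slicing issue. */
#define SKETCH_MIN_SLICE_VERSION SKETCH_VERSION_NUMBER(1, 8, 3)

/* Hypothetical decompressor shape: returns bytes written, or -1 on error. */
typedef int (*sketch_decompress_fn) (const char *src, int srclen,
                                     char *dst, int dstcap);

/* Nonzero when partial (sliced) decompression can be trusted. */
int
sketch_slicing_is_safe(int lz4_version_number)
{
    return lz4_version_number >= SKETCH_MIN_SLICE_VERSION;
}

/*
 * Write at most slicelength bytes of the decompressed data into dst.
 * With an old library, decompress everything into scratch (which must hold
 * rawsize bytes) and copy only the requested prefix.
 */
int
sketch_decompress_slice(int lz4_version_number,
                        sketch_decompress_fn partial,
                        sketch_decompress_fn full,
                        const char *src, int srclen,
                        char *dst, int slicelength,
                        char *scratch, int rawsize)
{
    int         n;

    assert(slicelength >= 0 && rawsize >= 0);

    if (sketch_slicing_is_safe(lz4_version_number))
        return partial(src, srclen, dst, slicelength);

    /* Old library: take the full-decompression path instead. */
    n = full(src, srclen, scratch, rawsize);
    if (n < 0)
        return -1;
    if (n > slicelength)
        n = slicelength;
    memcpy(dst, scratch, (size_t) n);
    return n;
}

/* Toy stand-in for a decompressor: copies bytes, no real compression. */
int
toy_copy(const char *src, int srclen, char *dst, int dstcap)
{
    int         n = (srclen < dstcap) ? srclen : dstcap;

    memcpy(dst, src, (size_t) n);
    return n;
}
```

The caller sees the same result on either path; only the amount of work
differs, since the fallback pays for a full decompression even when a
short prefix was requested.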

Now, I think the only pending item is related to expandedrecord.
Currently, we detoast the compressed field only in the
expanded_record_set_field_internal function. I am still not completely
sure whether, for the built-in types, we need to do something in
expanded_record_set_tuple and expanded_record_set_field as well. That is,
in these functions, do we expand only the external data so that it
survives COMMIT/ROLLBACK, or do we also expand it in order to send it to
some target table, like we do in expanded_record_set_field_internal?

As Robert mentioned upthread, fixing this in expanded_record_set_field
might not be very problematic, as the tuple is already deformed there, but
we have a problem in expanded_record_set_tuple, as we might need to deform
the tuple even when there is no compressed/external data.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachment                                                         Content-Type   Size
v29-0001-Disallow-compressed-data-inside-container-types.patch     text/x-patch   47.4 KB
v29-0004-default-to-with-lz4.patch                                 text/x-patch   1.7 KB
v29-0003-Add-default_toast_compression-GUC.patch                   text/x-patch   8.6 KB
v29-0002-Built-in-compression-method.patch                         text/x-patch   111.1 KB
