Re: [HACKERS] Custom compression methods

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, David Steele <david(at)pgmasters(dot)net>, Ildus Kurbangaliev <i(dot)kurbangaliev(at)gmail(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [HACKERS] Custom compression methods
Date: 2021-03-16 05:46:08
Message-ID: CAFiTN-vqPt1-kRKLnLpB0w2MsAgibRsqJesRaHAehAmGqkMxBg@mail.gmail.com
Lists: pgsql-hackers

On Tue, Mar 16, 2021 at 4:28 AM Andres Freund <andres(at)anarazel(dot)de> wrote:

Replying to some of the comments:

> - Is nodeModifyTable.c really the right place for the logic around
> CompareCompressionMethodAndDecompress()? And is doing it in every
> place that does "user initiated" inserts really the right way? Why
> isn't this done on the tuptoasting level?

I think if we do it at the tuptoasting level it will be even
costlier, because in nodeModifyTable.c we will often still have a
virtual tuple (e.g. when a user inserts tuples directly), whereas by
the time we reach the tuptoasting level we always have a HeapTuple
and would have to deform it in every case where the tupdesc has any
varlena attribute, because there is no flag in the tuple header to
tell us whether any compressed data is present. In the thread
below[1] we considered these two approaches and, in unrelated
workloads such as pgbench, saw no performance regression with either
of them.

[1] https://www.postgresql.org/message-id/CAFiTN-vcbfy5ScKVUp16c1N_wzP0RL6EkPBAg_Jm3eDK0ftO5Q%40mail.gmail.com
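
Just to make the shape of that check concrete, here is a simplified
sketch (not the actual patch code) of what a
CompareCompressionMethodAndDecompress()-style pass over a slot could
look like; compression_matches() is a hypothetical helper standing in
for the real comparison against the target column's compression
method, and externally toasted values are ignored for brevity:

#include "postgres.h"
#include "access/detoast.h"
#include "executor/tuptable.h"

/* hypothetical helper, not a real PostgreSQL function */
static bool compression_matches(Form_pg_attribute att,
                                struct varlena *val);

static void
decompress_mismatched_attrs(TupleTableSlot *slot, TupleDesc tupdesc)
{
    /* deforming the tuple is exactly the cost we are worried about */
    slot_getallattrs(slot);

    for (int i = 0; i < tupdesc->natts; i++)
    {
        Form_pg_attribute att = TupleDescAttr(tupdesc, i);
        struct varlena *val;

        /* only non-dropped, non-null varlenas can hold compressed data */
        if (att->attisdropped || att->attlen != -1 || slot->tts_isnull[i])
            continue;

        val = (struct varlena *) DatumGetPointer(slot->tts_values[i]);

        if (VARATT_IS_COMPRESSED(val) && !compression_matches(att, val))
        {
            /*
             * Decompress now, so that the toaster recompresses the
             * value with the column's configured method when the
             * tuple is stored.
             */
            slot->tts_values[i] = PointerGetDatum(detoast_attr(val));
        }
    }
}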

> > I'm open to being convinced that we don't need to do either of these
> > things, and that the cost of iterating over all varlenas in the tuple
> > is not so bad as to preclude doing things as you have them here. But,
> > I'm afraid it's going to be too expensive.
>
> I mean, I would just define several of those places away by not caring
> about tuples in a different compression format ending up in a
> table...

I am just wondering why we don't need similar processing in the case
of a storage change: if the target table's attribute storage is
'external' and compressed data comes in from a source table, then we
will insert that compressed data as-is into the target attribute
without externalizing it. Maybe that is done to avoid exactly this
kind of performance impact? If so, we could do the same for
compression as well and just provide some mechanism to recompress,
perhaps in VACUUM FULL/CLUSTER.
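
For what it's worth, a storage check in the same spirit could be as
small as the sketch below; needs_decompress_for_storage() is an
illustrative name, not an existing PostgreSQL function:

#include "postgres.h"
#include "catalog/pg_attribute.h"
#include "catalog/pg_type.h"    /* TYPSTORAGE_EXTERNAL */

/*
 * attstorage 'e' (TYPSTORAGE_EXTERNAL) means "toast but never
 * compress", so a value arriving already compressed violates the
 * column's setting and would need decompressing first.
 */
static bool
needs_decompress_for_storage(Form_pg_attribute att, struct varlena *val)
{
    return att->attstorage == TYPSTORAGE_EXTERNAL &&
           VARATT_IS_COMPRESSED(val);
}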

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
