Re: [HACKERS] Custom compression methods

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, David Steele <david(at)pgmasters(dot)net>, Ildus Kurbangaliev <i(dot)kurbangaliev(at)gmail(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [HACKERS] Custom compression methods
Date: 2021-02-05 05:31:54
Message-ID: CAFiTN-upHFgR2vHcWUZyWFNdkZ82TXtSMC1PQhdkUkmF5tVG7g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Feb 5, 2021 at 3:51 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Thu, Feb 4, 2021 at 11:39 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > Yeah, actually, I thought I would avoid calling slot_getallattrs if
> > none of the attributes got decompress. I agree if we call this before
> > we can avoid calling slot_getattr but slot_getattr
> > is only called for the attribute which has attlen -1. I agree that if
> > we call slot_getattr for attnum n then it will deform all the
> > attributes before that. But then slot_getallattrs only need to deform
> > the remaining attributes not all. But maybe we can call the
> > slot_getallattrs as soon as we see the first attribute with attlen -1
> > and then avoid calling subsequent slot_getattr, maybe that is better
> > than compared to what I have because we will avoid calling
> > slot_getattr for many attributes, especially when there are many
> > verlena.
>
> I think that if we need to deform at all, we need to deform all
> attributes, right?

IMHO that is not true, because we might need to deform the attribute
just to check its stored compression. So for example the first
attribute is varchar and the remaining 100 attributes are interger.
So we just need to deform the first attribute and if the compression
method of that is the same as the target attribute then we are done
and not need to deform the remaining and we can just continue with the
original slot and tuple.

I am not saying this is a very practical example and we have to do it
like this, but I am just making a point that it is not true that if we
deform at all then we have to deform all. However, if we decompress
any then we have to deform all because we need to materialize the
tuple again.

So there's no point in considering e.g.
> slot_getsomeattrs(). But just slot_getallattrs() as soon as we know we
> need to do it might be worthwhile. Could even have two loops: one that
> just figures out whether we need to deform; if not, return. Then
> slot_getallattrs(). Then another loop to do the work.
>
> > I think the supported procedure for this sort of
> > > thing is to have a second slot, set tts_values, tts_isnull etc. and
> > > then materialize the slot. After materializing the new slot, it's
> > > independent of the old slot, which can then be cleared. See for
> > > example tts_virtual_materialize().
> >
> > Okay, so if we take a new slot then we need to set this slot reference
> > in the ScanState also otherwise that might point to the old slot. I
> > haven't yet analyzed where all we might be keeping the reference to
> > that old slot. Or I am missing something.
>
> My guess is you want to leave the ScanState alone so that we keep
> fetching into the same slot as before and have an extra slot on the
> side someplace.

Okay, got your point. Thanks.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2021-02-05 05:45:40 Re: Transactions involving multiple postgres foreign servers, take 2
Previous Message Hou, Zhijie 2021-02-05 05:25:05 RE: Parallel INSERT (INTO ... SELECT ...)