Re: Pluggable Storage - Andres's take

From: Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Pluggable Storage - Andres's take
Date: 2018-07-17 13:01:36
Message-ID: CAJrrPGfXGoDxZGg7oWYwr70xKx5ET+rwt3sUTGrTBimVvERgQQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jul 16, 2018 at 11:35 PM Andres Freund <andres(at)anarazel(dot)de> wrote:

> On 2018-07-04 20:11:21 +1000, Haribabu Kommi wrote:
> > On Tue, Jul 3, 2018 at 5:06 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> > > The most fundamental issues I had with Haribabu's last version from [2]
> > > are the following:
> > >
> > > - The use of TableTuple, a typedef from void *, is bad from multiple
> > > fronts. For one it reduces just about all type safety. There were
> > > numerous bugs in the patch where things were just cast from HeapTuple
> > > to TableTuple to HeapTuple (and even to TupleTableSlot). I think
> it's
> > > a really, really bad idea to introduce a vague type like this for
> > > development purposes alone, it makes it way too hard to refactor -
> > > essentially throwing the biggest benefit of type safe languages out
> of
> > > the window.
> > >
> >
> > My earlier intention was to remove the HeapTuple usage entirely and
> replace
> > it with slot everywhere outside the tableam. But it ended up with
> TableTuple
> > before it reach to the stage because of heavy use of HeapTuple.
>
> I don't think that's necessary - a lot of the system catalog accesses
> are going to continue to be heap tuple accesses. And the conversions you
> did significantly continue to access TableTuples as heap tuples - it was
> just that the compiler didn't warn about it anymore.
>
> A prime example of that is the way the rewriteheap / cluster
> integreation was done. Cluster continued to sort tuples as heap tuples -
> even though that's likely incompatible with other tuple formats which
> need different state.
>

OK. Understood.

> > > - the heap tableam did a heap_copytuple() nearly everywhere. Leading to
> > > a higher memory usage, because the resulting tuples weren't freed or
> > > anything. There might be a reason for doing such a change - we've
> > > certainly discussed that before - but I'm *vehemently* against doing
> > > that at the same time we introduce pluggable storage. Analyzing the
> > > performance effects will be hard enough without changes like this.
> > >
> >
> > How about using of slot instead of tuple and reusing of it? I don't know
> > yet whether it is possible everywhere.
>
> Not quite sure what you mean?
>

I thought of using slot everywhere can reduce the use of heap_copytuple,
I understand that you already did those changes from you reply from the
other mail.

> > Tasks / Questions:
> > >
> > > - split up patch
> > >
> >
> > How about generating refactoring changes as patches first based on
> > the code in your repo as discussed here[1]?
>
> Sure - it was just at the moment too much work ;)
>

Yes, it is too much work. How about doing this once most of the
open items are finished?

>
> > > - Change heap table AM to not allocate handler function for each table,
> > > instead allocate it statically. Avoids a significant amount of data
> > > duplication, and allows for a few more compiler optimizations.
> > >
> >
> > Some kind of static variable handlers for each tableam, but need to check
> > how can we access that static handler from the relation.
>
> I'm not sure what you mean by "how can we access"? We can just return a
> pointer from the constant data from the current handler? Except that
> adding a bunch of consts would be good, the external interface wouldn't
> need to change?
>

I mean we may need to store some tableam ID in each table, so that based on
that ID we get the static tableam handler, because at a time there may be
tables from different tableam methods.

>
> > > - change scan level slot creation to use tableam function for doing so
> > > - get rid of slot->tts_tid, tts_tupleOid and potentially tts_tableOid
> > >
> >
> > so with this there shouldn't be a way from slot to tid mapping or there
> > should be some other way.
>
> I'm not sure I follow?
>

Replacing of heaptuple with slot, currently with slot only the tid is passed
to the tableam methods, To get rid of the tid from slot, we may need any
other method of passing?

> > - bitmap index scans probably need a new tableam.h callback, abstracting
> > > bitgetpage()
> > >
> >
> > OK.
>
> Any chance you could try to tackle this? I'm going to be mostly out
> this week, so we'd probably not run across each others feet...
>

OK, I will take care of the above point.

Regards,
Haribabu Kommi
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2018-07-17 13:03:28 Re: [HACKERS] PATCH: multivariate histograms and MCV lists
Previous Message Andrey Klychkov 2018-07-17 12:36:46 Re[2]: Alter index rename concurrently to