Re: Pluggable storage

From: Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Pluggable storage
Date: 2017-08-21 07:28:59
Message-ID: CAJrrPGcUNVweTM27_yaAHQTvi_1e3nXtc9wQs34GL2=-BE6ukg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Aug 13, 2017 at 5:17 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
wrote:

> On Sat, Aug 12, 2017 at 10:31 AM, Haribabu Kommi
> <kommi(dot)haribabu(at)gmail(dot)com> wrote:
> >>
> >> Why do we need to store handler function in TupleDesc? As of now, the
> >> above patch series has it available in RelationData and
> >> TupleTableSlot, I am not sure if instead of that keeping it in
> >> TupleDesc is a good idea. Which all kind of places require TupleDesc
> >> to contain handler? If those are few places, can we think of passing
> >> it as a parameter?
> >
> >
> > Till now I am to able to proceed without adding any storage handler
> > functions to
> > TupleDesc structure. Sure, I will try the way of passing as a parameter
> when
> > there is a need of it.
> >
>
> Okay, I think it is better if you discuss such locations before
> directly modifying those.
>

Sure. I will check with community before making any such changes.

> > During the progress of the patch, I am facing problems in designing the
> > storage API
> > regarding the Buffer. For example To replace all the
> HeapTupleSatisfiesMVCC
> > and
> > related functions with function pointers, In HeapTuple format, the tuple
> may
> > belongs
> > to one buffer, so the buffer is passed to the HeapTupleSatifisifes***
> > functions along
> > with buffer, But in case of other storage formats, the single buffer may
> not
> > contains
> > the actual data.
> >
>
> Also, it is quite possible that some of the storage Am's don't even
> want to return bool as a parameter from HeapTupleSatisfies* API's. I
> guess what we need here is to provide a way so that different storage
> am's can register their function pointer for an equivalent to
> satisfies function. So, we need to change
> SnapshotData.SnapshotSatisfiesFunc in some way so that different
> handlers can register their function instead of using that directly.
> I think that should address the problem you are planning to solve by
> omitting buffer parameter.
>

Thanks for your suggestion. Yes, it is better to go in the direction of
SnapshotSatisfiesFunc.

I verified the above idea of implementing the Tuple visibility functions
and assign them into the snapshotData structure based on the snapshot.

The Tuple visibility functions that are specific to the relation are
available
with the RelationData structure and this structure may not be available,
so I changed the SnapShotData structure to hold an enum to represent
what type of snapshot it is, instead of storing the pointer to the tuple
visibility function. Whenever there is a need to check for the tuple
visibilty
the storageam handler pointer corresponding to the snapshot type is
called and result is obtained as earlier.

> This buffer is used to set the Hint bits and mark the
> > buffer as dirty.
> > In case if the buffer is not available, the performance may affect for
> the
> > following
> > queries if the hint bits are not set.
> >
>
> I don't think it is advisable to change that for the current heap.
>

I didn't change the prototype of existing functions. Currently tuple
visibility
functions assumes that Buffer is always proper, but that may not be correct
based on the storage.

>
> > And also the Buffer is used to get from heap_fetch, heap_lock_tuple and
> > related
> > functions to check the Tuple visibility, but currently returning a buffer
> > from the above
> > heap_** function is not possible for other formats.
> >
>
> Why not? I mean if we consider that all the formats we are worried at
> this stage have TID (block number, tuple location), then we can get
> the buffer. We might want to consider passing TID as a parameter to
> these API's if required to make that possible. You also agreed above
> [1] that we can first design the API considering storage formats
> having TID.
>

The current approach is to support the storages that support TID bits.
But what I mean here, in some storage methods (for example column
storage), the tuple is not present in one buffer, the tuple data may be
calculated from many buffers and return the slot/storageTuple (until
unless we change everywhere to slot).

If any of the following code after the storage methods is expecting
a Buffer that should be valid may need some changes to check it first
whether it is a valid or not and perform the operations based on that.

> > And also for the
> > HeapTuple data,
> > the tuple data is copied into palloced buffer instead of pointing
> directly
> > to the page.
> > So, returning a Buffer is a valid or not here?
> >
>
> Yeah, but I think for the sake of compatibility and not changing too
> much in the current API's signature, we should try to avoid it.
>

Currently I am trying to avoid changing the current API's signatures.
Most of the signature changes are something like HeapTuple -> StorageTuple
and etc.

> > Currently I am proceeding to remove the Buffer as parameter in the API
> and
> > proceed
> > further, In case if it affects the performance, we need to find out a
> > different appraoch
> > in handling the hint bits.
> >
>
> Leaving aside the performance concern, I am not convinced that it is a
> good idea to remove Buffer as a parameter from the API's you have
> mentioned above. Would you mind thinking once again keeping the
> suggestions provided above in this email to see if we can avoid
> removing Buffer as a parameter?

Thanks for your suggestions.
Yes. I am able to proceed without removing Buffer parameter.

Regards,
Hari Babu
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2017-08-21 07:32:50 Re: expanding inheritance in partition bound order
Previous Message Ashutosh Bapat 2017-08-21 07:13:16 advanced partition matching algorithm for partition-wise join