Re: [HACKERS] Pluggable storage

From: Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>
To: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Pluggable storage
Date: 2018-01-04 05:03:18
Message-ID: CAJrrPGfq2=qz2HmUUDjsueOfADOVAjx7Dbrodi9ZoAnDXO-TsA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jan 4, 2018 at 10:00 AM, Alexander Korotkov <
a(dot)korotkov(at)postgrespro(dot)ru> wrote:

> On Wed, Jan 3, 2018 at 10:08 AM, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>
> wrote:
>
>>
>> On Wed, Dec 27, 2017 at 11:33 PM, Alexander Korotkov <
>> a(dot)korotkov(at)postgrespro(dot)ru> wrote:
>>
>>>
>>> Also, I appreciate that now tuple_insert() and tuple_update() methods
>>> are responsible for inserting index tuples. This unleash pluggable
>>> storages to implement another way of interaction with indexes. However, I
>>> didn't get the point of passing InsertIndexTuples IndexFunc to them. Now,
>>> we're always passing ExecInsertIndexTuples() to this argument. As I
>>> understood storage is free to either call ExecInsertIndexTuples() or
>>> implement its own logic of interaction with indexes. But, I don't
>>> understand why do we need a callback when tuple_insert() and tuple_update()
>>> can call ExecInsertIndexTuples() directly if needed. Another thing is that
>>> tuple_delete() could also interact with indexes (especially when we will
>>> enhance index access method API), and we need to pass meta-information
>>> about indexes to tuple_delete() too.
>>>
>>
>> The main reason for which I added the callback function to not to
>> introduce the
>> dependency of storage on executor functions. This way storage can call the
>> function that is passed to it without any knowledge. I added the function
>> pointer
>> for tuple_delete also in the new patches, currently it is passed as NULL
>> for heap.
>> These API's can be enhanced later.
>>
>
> Understood, but in order to implement alternative behavior with indexes
> (for example,
> insert index tuples to only some of indexes), storage am will still have
> to call executor
> functions. So, yes this needs to be enhanced. Probably, we just need to
> implement
> nicer executor API for storage am.
>

OK.

> Apart from rebase, Added storage shared memory API, currently this API is
>> used
>> only by the syncscan. And also all the exposed functions of syncscan
>> usage is
>> removed outside the heap.
>>
>
> This makes me uneasy. You introduce two new hooks for size estimation and
> initialization
> of shared memory needed by storage am's. But if storage am is implemented
> in shared library,
> then this shared library can use our generic method for allocation of
> shared memory
> (including memory needed by storage am). If storage am is builtin, then
> hooks are also not
> needed, because we know all our builtin storage am's in advance. For me,
> it would be
> nice to encapsulate heap am requirements in shared memory into functions
> like
> HeapAmShmemSize() and HeapAmShmemInit(), and don't explicitly show outside
> that
> this memory is needed for synchronized scan. But separate hooks don't
> look justified for me.
>

Yes, I agree that for the builtin storage's there is no need of hooks. But
in future,
if we want to support multiple storage's in an instance, we may need hooks
for shared memory
registration. I am fine to change it.

Regards,
Hari Babu
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2018-01-04 06:09:33 Re: pgsql: Add parallel-aware hash joins.
Previous Message Jing Wang 2018-01-04 04:20:52 Libpq support to connect to standby server as priority