Re: Pluggable storage

From: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To: Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Pluggable storage
Date: 2017-09-27 16:51:23
Message-ID: CAPpHfduejBW2SeVD=TYF013Q-bRhR+P0CVs0xotFNEr5dKGRbw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi!

I took a look on this patch. I've following notes for now.

tuple_insert_hook tuple_insert; /* heap_insert */
> tuple_update_hook tuple_update; /* heap_update */
> tuple_delete_hook tuple_delete; /* heap_delete */

I don't think this set of functions provides good enough level of
abstraction for storage AM. This functions encapsulate only low-level work
of insert/update/delete tuples into heap itself. However, it still assumed
that indexes are managed outside of storage AM. I don't think this is
right, assuming that one of most demanded storage API usage would be
different MVCC implementation (for instance, UNDO log based as discussed
upthread). Different MVCC implementation is likely going to manage indexes
in a different way. For example, storage AM utilizing UNDO would implement
in-place update even when indexed columns are modified. Therefore this
piece of code in ExecUpdate() wouldn't be relevant anymore.

/*
> * insert index entries for tuple
> *
> * Note: heap_update returns the tid (location) of the new tuple in
> * the t_self field.
> *
> * If it's a HOT update, we mustn't insert new index entries.
> */
> if ((resultRelInfo->ri_NumIndices > 0) &&
> !storage_tuple_is_heaponly(resultRelationDesc, tuple))
> recheckIndexes = ExecInsertIndexTuples(slot, &(slot->tts_tid),
> estate, false, NULL, NIL);

I'm firmly convinced that this logic should be encapsulated into storage AM
altogether with inserting new index tuples on storage insert. Also, HOT
should be completely encapsulated into heapam. It's quite evident for me
that storage utilizing UNDO wouldn't have anything like our current HOT.
Therefore, I think there shouldn't be hot_search_buffer() API function.
tuple_fetch() may cover hot_search_buffer(). That might require some
signature change of tuple_fetch() (probably, extra arguments).

LockTupleMode and HeapUpdateFailureData shouldn't be private of heapam.
Any fullweight OLTP storage AM should support our tuple lock modes and
should be able to report update failures. HeapUpdateFailureData should be
renamed to something like StorageUpdateFailureData. Contents of
HeapUpdateFailureData seems to me general enough to be supported by any
storage with ItemPointer tuple locator.

storage_setscanlimits() is used only during index build. I think that
since storage AM may have different MVCC implementation then storage AM
should decide how to communicate with indexes including index build.
Therefore, instead of exposing storage_setscanlimits(), the
whole IndexBuildHeapScan() should be encapsulated into storage AM.

Also, BulkInsertState should be private of structure of heapam. Another
storages may have another state for bulk insert. On API level we might
have some abstract pointer instead of BulkInsertState while having
GetBulkInsertState and others as API methods.

storage_freeze_tuple() is called only once from rewrite_heap_tuple(). That
makes me think that tuple_freeze API method is wrong for abstraction. We
probably should make rewrite_heap_tuple() or even the
whole rebuild_relation() an API method...

Heap reloptions are untouched for now. Storage AM should be able to
provide its own specific options just like index AMs do.

That's all I have for now.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2017-09-27 17:00:32 Re: Binary search in fmgr_isbuiltin() is a bottleneck.
Previous Message Bossart, Nathan 2017-09-27 16:20:43 Re: [Proposal] Allow users to specify multiple tables in VACUUM commands