Re: Pluggable storage

From: Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>
To: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Cc: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Pluggable storage
Date: 2017-09-19 07:34:48
Lists: pgsql-hackers

On Fri, Sep 15, 2017 at 5:10 AM, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru> wrote:

> On Thu, Sep 14, 2017 at 8:17 AM, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>
> wrote:
>> Instead of modifying the Bitmap Heap and Sample scans to avoid referring
>> to the internal members of HeapScanDesc, I divided HeapScanDesc into
>> two parts:
>> 1. StorageScanDesc
>> 2. HeapPageScanDesc
>> The StorageScanDesc contains the minimal information that is required
>> outside the storage routine, and it must be provided by all storage
>> routines. This structure contains minimal information such as the
>> relation, snapshot, buffer, etc.
>> The HeapPageScanDesc contains the extra information that is required for
>> the Bitmap Heap and Sample scans to work. This structure contains
>> information about blocks, visible offsets, etc. Currently this structure
>> is used only in the Bitmap Heap and Sample scans and in the related
>> contrib modules, except the pgstattuple module. pgstattuple needs some
>> additional changes.
>> I added an additional storage API that returns the HeapPageScanDesc, as
>> required by the Bitmap Heap and Sample scans; this API is called only in
>> these two scans. These scan methods are chosen by the planner only when
>> the storage routine supports the API that returns the HeapPageScanDesc.
>> Currently I have implemented the planner support only for Bitmap scans;
>> it is yet to be done for Sample scans.
>> With the above approach, I removed all the references to HeapScanDesc
>> outside the heap. The changes for this approach are available in
>> 0008-Remove-HeapScanDesc-usage-outside-heap.patch
>> Any suggestions/comments on the above approach?
> For me, that's an interesting idea. Naturally, the way BitmapHeapScan and
> SampleScan work, even at a very high level, is applicable only to some
> storage AMs (i.e. heap-like storage AMs). For example, an index-organized
> table wouldn't ever support BitmapHeapScan, because it refers to tuples by
> PK values, not TIDs. However, in this case, the storage AM might have some
> alternative to our BitmapHeapScan. So, an index-organized table might have
> some compressed representation of an ordered set of PK values and use it
> for bulk fetches from the PK index.
> Therefore, I think it would be nice to make BitmapHeapScan a
> heap-storage-AM-specific scan method, while other storage AMs could provide
> other storage-AM-specific scan methods. Probably it would be too much for
> this patchset and should be done during one of the next work cycles on
> storage AMs (I'm sure that such a huge project as pluggable storage AMs
> would have multiple iterations).

Thanks for your opinion. Yes, my first thought was also to make these
two scan methods part of the storage AMs. I feel the approach of just
exposing some additional hooks doesn't look good. This may need some
better infrastructure that lets storage AMs provide their own scan methods.

For this reason, I have currently developed the temporary approach of
separating HeapScanDesc into two structures.
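
As a rough illustration of the split described above, here is a minimal
sketch. It is not the code from the attached patches: only the names
StorageScanDesc and HeapPageScanDesc come from this thread. The field names,
the stand-in Relation/Snapshot types, HeapScanDescData as the combined
struct, and heap_get_pagescan are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for PostgreSQL types; not the real definitions. */
typedef uint32_t BlockNumber;
typedef uint16_t OffsetNumber;
typedef struct Relation { const char *name; } Relation;
typedef struct Snapshot { uint32_t xmin; } Snapshot;

#define MAX_VISIBLE_TUPLES 8    /* illustrative limit only */

/* Part 1: minimal state that every storage routine exposes to callers. */
typedef struct StorageScanDesc
{
    Relation   *rel;
    Snapshot   *snapshot;
    int         buffer;         /* stand-in for a buffer handle */
} StorageScanDesc;

/* Part 2: page-level state needed only by Bitmap Heap and Sample scans. */
typedef struct HeapPageScanDesc
{
    BlockNumber  nblocks;       /* total blocks to scan */
    BlockNumber  cblock;        /* current block */
    int          ntuples;       /* number of visible tuples on the page */
    OffsetNumber vistuples[MAX_VISIBLE_TUPLES];
} HeapPageScanDesc;

/* The heap keeps both parts together; callers see only the base part. */
typedef struct HeapScanDescData
{
    StorageScanDesc  base;      /* must be the first member */
    HeapPageScanDesc page;
} HeapScanDescData;

/* Signature of the extra API that hands out the page-level part. */
typedef HeapPageScanDesc *(*scan_get_pagescan_fn) (StorageScanDesc *scan);

/* Heap implementation of that API (assumed name). */
static HeapPageScanDesc *
heap_get_pagescan(StorageScanDesc *scan)
{
    assert(scan != NULL);
    /* Valid because base is the first member of HeapScanDescData. */
    return &((HeapScanDescData *) scan)->page;
}
```

In this sketch, code outside the heap holds only a StorageScanDesc pointer
and obtains the page-level state through the callback, so it never needs to
see the combined heap struct.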

> Similarly, SampleScans contain storage-AM-specific logic. For instance,
> our SYSTEM sampling method fetches random blocks from the heap, providing a
> high-performance way to sample the heap. Coming back to the example of an
> index-organized table, it could provide its own storage-AM-specific table
> sampling methods, including sophisticated PK tree traversal with fetching
> random small ranges of PK. Given that tablesample methods are already
> pluggable, making them storage-AM-specific would lead to user-visible
> changes. I.e. a tablesample method should be created for a particular
> storage AM or set of storage AMs. However, I haven't yet figured out what
> the API should look like exactly...

Regarding SampleScans, I feel we can follow the same approach of supporting
particular sample methods with particular storage AMs, similar to Bitmap
scans. I haven't checked it completely yet.
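
One way to picture choosing a scan method only when the storage AM supports
it is the sketch below. It is an assumption-laden illustration rather than
the planner change in the patches: StorageAmRoutine, ScanMethod,
storage_supports_scan and heaplike_get_pagescan are hypothetical names, and
treating sequential scans as always supported is also an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct PageScanState PageScanState;     /* opaque, hypothetical */

/* Hypothetical routine table for a storage AM. */
typedef struct StorageAmRoutine
{
    const char *amname;
    /* Optional callback: NULL means the AM has no page-level scan state. */
    PageScanState *(*scan_get_pagescan) (void *scan);
} StorageAmRoutine;

typedef enum ScanMethod
{
    SCAN_SEQ,
    SCAN_BITMAP_HEAP,
    SCAN_SAMPLE
} ScanMethod;

/*
 * Return true if the planner may consider the given scan method for a
 * relation that uses this storage AM.
 */
static bool
storage_supports_scan(const StorageAmRoutine *routine, ScanMethod method)
{
    assert(routine != NULL);

    switch (method)
    {
        case SCAN_SEQ:
            return true;        /* assumed to be supported by every AM */
        case SCAN_BITMAP_HEAP:
        case SCAN_SAMPLE:
            /* Both need the page-level state, so both need the callback. */
            return routine->scan_get_pagescan != NULL;
    }
    return false;
}

/* Placeholder callback for a heap-like AM. */
static PageScanState *
heaplike_get_pagescan(void *scan)
{
    return (PageScanState *) scan;
}
```

With this shape, an index-organized AM would simply leave the callback NULL,
and neither the bitmap nor the sample path would be considered for it.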

Rebased patches are attached.

Hari Babu
Fujitsu Australia

Attachment                                                        Content-Type              Size
0008-Remove-HeapScanDesc-usage-outside-heap.patch                 application/octet-stream  90.7 KB
0001-Change-Create-Access-method-to-include-storage-handl.patch   application/octet-stream  9.7 KB
0002-Storage-AM-API-hooks-and-related-functions.patch             application/octet-stream  18.4 KB
0003-Adding-storageam-hanlder-to-relation-structure.patch         application/octet-stream  7.0 KB
0004-Adding-tuple-visibility-function-to-storage-AM.patch         application/octet-stream  145.9 KB
0005-slot-hooks-are-added-to-storage-AM.patch                     application/octet-stream  60.1 KB
0006-Tuple-Insert-API-is-added-to-Storage-AM.patch                application/octet-stream  246.2 KB
0007-Scan-functions-are-added-to-storage-AM.patch                 application/octet-stream  176.5 KB
