Re: Pluggable Storage - Andres's take

From: Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>
To: Asim R P <apraveen(at)pivotal(dot)io>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, aagrawal(at)pivotal(dot)io, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Subject: Re: Pluggable Storage - Andres's take
Date: 2018-11-27 01:48:36
Message-ID: CAJrrPGc66Bcv5_Lkw+6yqSzptEw2mh8o=JH05poNh4BZYuGN2w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Nov 22, 2018 at 1:12 PM Asim R P <apraveen(at)pivotal(dot)io> wrote:

> Ashwin (copied) and I got a chance to go through the latest code from
> Andres' github repository. We would like to share some
> comments/quesitons:
>

Thanks for the review.

> The TupleTableSlot argument is well suited for row-oriented storage.
> For a column-oriented storage engine, a projection list indicating the
> columns to be scanned may be necessary. Is it possible to share this
> information with current interface?
>

Currently all the interfaces are designed for row-oriented storage, as you
said we need a new API for projection list. The current patch set itself
is big and it needs to stabilized and then in the next set of the patches,
those new API's will be added that will be useful for columnar storage.

> We realized that DDLs such as heap_create_with_catalog() are not
> generalized. Haribabu's latest patch that adds
> SetNewFileNode_function() and CreateInitFort_function() is a step
> towards this end. However, the current API assumes that the storage
> engine uses relation forks. Isn't that too restrictive?
>

Current set of API has many assumptions and uses the existing framework.
Thanks for your point, will check it how to enhance it.

> TupleDelete_function() accepts changingPart as a parameter to indicate
> if this deletion is part of a movement from one partition to another.
> Partitioning is a higher level abstraction as compared to storage.
> Ideally, storage layer should have no knowledge of partitioning. The
> tuple delete API should not accept any parameter related to
> partitioning.
>

Thanks for your point, will look into it in how to change extract it.

> The API needs to be more accommodating towards block sizes used in
> storage engines. Currently, the same block size as heap seems to be
> assumed, as evident from the type of some members of generic scan
> object:
>
> typedef struct TableScanDescData
> {
> /* state set up at initscan time */
> BlockNumber rs_nblocks; /* total number of blocks in rel */
> BlockNumber rs_startblock; /* block # to start at */
> BlockNumber rs_numblocks; /* max number of blocks to scan */
> /* rs_numblocks is usually InvalidBlockNumber, meaning "scan whole rel"
> */
> bool rs_syncscan; /* report location to syncscan logic? */
> } TableScanDescData;
>
> Using bytes to represent this information would be more generic. E.g.
> rs_startlocation as bytes/offset instead of rs_startblock and so on.
>

I doubt that this may not be the only one that needs a change to support
different block sizes for different storage interfaces. Thanks for your
point,
but definitely this can be taken care in the next set of patches.

Andres, as the tupletableslot changes are committed, do you want me to
share the rebased pluggable storage patch? you already working on it?

Regards,
Haribabu Kommi
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Gierth 2018-11-27 01:49:30 Re: IMMUTABLE and PARALLEL SAFE function markings
Previous Message Michael Paquier 2018-11-27 01:27:17 Re: Handling of REGRESS_OPTS in MSVC for regression tests