Re: Pluggable Storage - Andres's take

From: Asim R P <apraveen(at)pivotal(dot)io>
To: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Cc: Ashwin Agrawal <aagrawal(at)pivotal(dot)io>, 9erthalion6(at)gmail(dot)com, kommi(dot)haribabu(at)gmail(dot)com, Andres Freund <andres(at)anarazel(dot)de>, alvherre(at)2ndquadrant(dot)com, a(dot)korotkov(at)postgrespro(dot)ru
Subject: Re: Pluggable Storage - Andres's take
Date: 2018-11-22 02:12:04
Message-ID: CANXE4TfZUZCg+afo+gpqiHF6O=rDfqt5a+3jTH2SD8FZ9Yd2pA@mail.gmail.com
Lists: pgsql-hackers

Ashwin (copied) and I got a chance to go through the latest code from
Andres' GitHub repository. We would like to share some
comments/questions:

The TupleTableSlot argument is well suited for row-oriented storage.
For a column-oriented storage engine, a projection list indicating the
columns to be scanned may be necessary, so that columns the query does
not need are never read. Is it possible to share this information with
the current interface?
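
To make the question concrete, here is a minimal sketch of how a
projection list could let a columnar engine skip unneeded columns. All
names below (ScanProjection, ColumnStore, fetch_row, SKETCH_MAX_COLS)
are hypothetical and do not exist in the patch; the toy store only
holds int columns.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_MAX_COLS 8       /* hypothetical cap, for this sketch only */

/* Hypothetical projection list: one flag per attribute. */
typedef struct ScanProjection
{
    int         natts;
    bool        needed[SKETCH_MAX_COLS];    /* true if the query needs it */
} ScanProjection;

/* Toy column store: every attribute lives in its own array. */
typedef struct ColumnStore
{
    int         natts;
    int         nrows;
    const int  *cols[SKETCH_MAX_COLS];
} ColumnStore;

/*
 * Fill values[]/isnull[] for one row, touching only projected columns.
 * Skipped columns are marked null and never read from storage.
 * Returns the number of columns actually read.
 */
static int
fetch_row(const ColumnStore *cs, const ScanProjection *proj,
          int row, int *values, bool *isnull)
{
    int         nread = 0;

    assert(row >= 0 && row < cs->nrows);
    assert(proj->natts == cs->natts);

    for (int att = 0; att < cs->natts; att++)
    {
        if (proj->needed[att])
        {
            values[att] = cs->cols[att][row];
            isnull[att] = false;
            nread++;
        }
        else
            isnull[att] = true;
    }
    return nread;
}
```

With three columns and a projection of {true, false, true}, only the
first and third arrays are dereferenced for each row.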

We realized that DDL routines such as heap_create_with_catalog() are
not generalized. Haribabu's latest patch that adds
SetNewFileNode_function() and CreateInitFork_function() is a step
towards this end. However, the current API assumes that the storage
engine uses relation forks. Isn't that too restrictive?

TupleDelete_function() accepts changingPart as a parameter to indicate
if this deletion is part of a movement from one partition to another.
Partitioning is a higher-level abstraction than storage. Ideally, the
storage layer should have no knowledge of partitioning, so the tuple
delete API should not accept any parameter related to partitioning.

The API needs to be more accommodating towards block sizes used in
storage engines. Currently, the same block size as heap seems to be
assumed, as evident from the type of some members of generic scan
object:

typedef struct TableScanDescData
{
    /* state set up at initscan time */
    BlockNumber rs_nblocks;     /* total number of blocks in rel */
    BlockNumber rs_startblock;  /* block # to start at */
    BlockNumber rs_numblocks;   /* max number of blocks to scan */
    /* rs_numblocks is usually InvalidBlockNumber, meaning "scan whole rel" */
    bool        rs_syncscan;    /* report location to syncscan logic? */
} TableScanDescData;

Using bytes to represent this information would be more generic. E.g.
rs_startlocation as bytes/offset instead of rs_startblock and so on.
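
A rough sketch of what such byte-based members could look like, with a
helper that maps a block-based range onto them for an arbitrary block
size. Only rs_startlocation comes from the suggestion above; the struct
name, the other members, the SKETCH_* constants and range_from_blocks()
are hypothetical. SKETCH_INVALID_BLOCK mirrors the value of
InvalidBlockNumber.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_INVALID_BLOCK  ((uint32_t) 0xFFFFFFFF)   /* "scan whole rel" */
#define SKETCH_WHOLE_REL      UINT64_MAX                /* byte equivalent */

/* Hypothetical byte-oriented counterpart of the block-based members. */
typedef struct ByteScanRange
{
    uint64_t    rs_nbytes;          /* total size of rel in bytes */
    uint64_t    rs_startlocation;   /* byte offset to start at */
    uint64_t    rs_numbytes;        /* max bytes to scan */
} ByteScanRange;

/*
 * Translate a block-based scan range into bytes. The block size is a
 * parameter, so an engine is free to use something other than the heap's.
 */
static ByteScanRange
range_from_blocks(uint32_t nblocks, uint32_t startblock,
                  uint32_t numblocks, uint32_t blocksize)
{
    ByteScanRange r;

    assert(blocksize > 0);

    /* Widen before multiplying to avoid 32-bit overflow. */
    r.rs_nbytes = (uint64_t) nblocks * blocksize;
    r.rs_startlocation = (uint64_t) startblock * blocksize;
    r.rs_numbytes = (numblocks == SKETCH_INVALID_BLOCK)
        ? SKETCH_WHOLE_REL
        : (uint64_t) numblocks * blocksize;
    return r;
}
```

A heap-like engine would pass its own block size here, while an engine
with larger or variable-sized blocks could fill in the byte fields
directly.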

Asim
