Re: Pluggable Storage - Andres's take

From: Andres Freund <andres(at)anarazel(dot)de>
To: Ashwin Agrawal <aagrawal(at)pivotal(dot)io>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Asim R P <apraveen(at)pivotal(dot)io>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Subject: Re: Pluggable Storage - Andres's take
Date: 2019-05-08 21:46:27
Message-ID: 20190508214627.hw7wuqwawunhynj6@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2019-05-07 23:18:39 -0700, Ashwin Agrawal wrote:
> On Mon, May 6, 2019 at 1:39 PM Ashwin Agrawal <aagrawal(at)pivotal(dot)io> wrote:
> > Also wish to point out, while working on Zedstore, we realized that
> > TupleDesc from Relation object can be trusted at AM layer for
> > scan_begin() API. As for ALTER TABLE rewrite case (ATRewriteTables()),
> > catalog is updated first and hence the relation object passed to AM
> > layer reflects new TupleDesc. For heapam its fine as it doesn't use
> > the TupleDesc today during scans in AM layer for scan_getnextslot().
> > Only TupleDesc which can trusted and matches the on-disk layout of the
> > tuple for scans hence is from TupleTableSlot. Which is little
> > unfortunate as TupleTableSlot is only available in scan_getnextslot(),
> > and not in scan_begin(). Means if AM wishes to do some initialization
> > based on TupleDesc for scans can't be done in scan_begin() and forced
> > to delay till has access to TupleTableSlot. We should at least add
> > comment for scan_begin() to strongly clarify not to trust Relation
> > object TupleDesc. Or maybe other alternative would be have separate
> > API for rewrite case.
>
> Just to correct my typo, I wish to say, TupleDesc from Relation object can't
> be trusted at AM layer for scan_begin() API.
>
> Andres, any thoughts on above. I see you had proposed "change the
> table_beginscan* API so it
> provides a slot" in [1], but seems received no response/comments that time.
> [1]
> https://www.postgresql.org/message-id/20181211021340.mqaown4njtcgrjvr%40alap3.anarazel.de

I don't think passing a slot at beginscan time is a good idea. There's
several places that want to use different slots for the same scan, and
we probably want to increase that over time (e.g. for batching), not
decrease it.

What kind of initialization do you want to do based on the tuple desc at
beginscan time?

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2019-05-08 21:51:35 Re: Inconsistency between table am callback and table function names
Previous Message Andres Freund 2019-05-08 21:43:35 Re: Pluggable Storage - Andres's take