Re: Pluggable Storage - Andres's take

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp
Cc: kommi(dot)haribabu(at)gmail(dot)com, 9erthalion6(at)gmail(dot)com, andres(at)anarazel(dot)de, alvherre(at)2ndquadrant(dot)com, a(dot)korotkov(at)postgrespro(dot)ru, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Pluggable Storage - Andres's take
Date: 2018-12-13 05:59:00
Message-ID: 20181213.145900.131706809.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

Hello.

At Tue, 27 Nov 2018 14:58:35 +0900, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> wrote in <080ce65e-7b96-adbf-1c8c-7c88d87eaeda(at)lab(dot)ntt(dot)co(dot)jp>
> + <para>
> +<programlisting>
> +TupleTableSlotOps *
> +slot_callbacks (Relation relation);
> +</programlisting>
> + API to access the slot specific methods;
> + Following methods are available;
> + <structname>TTSOpsVirtual</structname>,
> + <structname>TTSOpsHeapTuple</structname>,
> + <structname>TTSOpsMinimalTuple</structname>,
> + <structname>TTSOpsBufferTuple</structname>,
> + </para>
>
> Unless I'm misunderstanding what the TupleTableSlotOps abstraction is or
> its relations to the TableAmRoutine abstraction, I think the text
> description could better be written as:
>
> "API to get the slot operations struct for a given table access method"
>
> It's not clear to me why various TTSOps* structs are listed here? Is the
> point that different AMs may choose one of the listed alternatives? For
> example, I see that heap AM implementation returns TTOpsBufferTuple, so it
> manipulates slots containing buffered tuples, right? Other AMs are free
> to return any one of these? For example, some AMs may never use buffer
> manager and hence not use TTOpsBufferTuple. Is that understanding correct?

Yeah, I'm not sure why it should not be a pointer to the struct
itself but a function. And the four struct doesn't seem relevant
to table AMs. Perhaps clear, getsomeattrs and so on should be
listed instead.

> + <para>
> +<programlisting>
> +Oid
> +tuple_insert (Relation rel, TupleTableSlot *slot, CommandId cid,
> + int options, BulkInsertState bistate);
> +</programlisting>
> + API to insert the tuple and provide the <literal>ItemPointerData</literal>
> + where the tuple is successfully inserted.
> + </para>
>
> It's not clear from the signature where you get the ItemPointerData.
> Looking at heapam_tuple_insert which puts it in slot->tts_tid, I think
> this should mention it a bit differently, like:
>
> API to insert the tuple contained in the provided slot and return its TID,
> that is, the location where the tuple is successfully inserted

It is actually an OID, not a TID in the current code. TID is
internaly handled.

> + <para>
> +<programlisting>
> +bool
> +tuple_fetch_follow (struct IndexFetchTableData *scan,
> + ItemPointer tid,
> + Snapshot snapshot,
> + TupleTableSlot *slot,
> + bool *call_again, bool *all_dead);
> +</programlisting>
> + API to get the all the tuples of the page that satisfies itempointer.
> + </para>
>
> IIUC, "all the tuples of of the page" in the above sentence means all the
> tuples in the HOT chain of a given heap tuple, making this description of
> the API slightly specific to the heap AM. Can we make the description
> more generic or is the API itself very specific that it cannot be
> expressed in generic terms? Ignoring that for a moment, I think the
> sentence contains more "the"s than there need to be, so maybe write as:
>
> API to get all tuples on a given page that are linked to the tuple of the
> given TID

Mmm. This is exposing MVCC matters to indexam. I suppose we
should refactor this API.

> + <para>
> +<programlisting>
> +tuple_data
> +get_tuple_data (TupleTableSlot *slot, tuple_data_flags flags);
> +</programlisting>
> + API to return the internal structure members of the HeapTuple.
> + </para>
>
> I think this description doesn't mention enough details of both the
> information that needs to be specified when calling the function (what's
> in flags) and the information that's returned.

(I suppose it will be described in later sections.)

> + <para>
> +<programlisting>
> +bool
> +scan_analyze_next_tuple (TableScanDesc scan, TransactionId OldestXmin,
> + double *liverows, double *deadrows, TupleTableSlot
> *slot));
> +</programlisting>
> + API to analyze the block and fill the buffered heap tuple in the slot
> and also
> + provide the live and dead rows.
> + </para>
>
> How about:
>
> API to get the next tuple from the block being scanned, which also updates
> the number of live and dead rows encountered

"live" and "dead" are MVCC terms. I suppose that we should stash
out the deadrows somwhere else. (But analyze code would need to
be modified if we do so.)

> +void
> +scansetlimits (TableScanDesc sscan, BlockNumber startBlk, BlockNumber
> numBlks);
> +</programlisting>
> + API to fix the relation scan range limits.
> + </para>
>
>
> How about:
>
> API to set scan range endpoints

This sets start point and the number of blocks.. Just "API to set
scan range" would be sifficient reffering to the parameter list.

> + <para>
> +<programlisting>
> +bool
> +scan_bitmap_pagescan (TableScanDesc scan,
> + TBMIterateResult *tbmres);
> +</programlisting>
> + API to scan the relation and fill the scan description bitmap with
> valid item pointers
> + for the specified block.
> + </para>
>
> This says "to scan the relation", but seems to be concerned with only a
> page worth of data as the name also says. Also, it's not clear what "scan
> description bitmap" means. Maybe write as:
>
> API to scan the relation block specified in the scan descriptor to collect
> and return the tuples requested by the given bitmap

"API to collect the tuples in a page requested by the given
bitmpap scan result." something? I think detailed explanation
would be required apart from the one-line description. Anyway the
name TBMIterateResult doesn't seem proper to expose.

> + <para>
> +<programlisting>
> +bool
> +scan_sample_next_block (TableScanDesc scan, struct SampleScanState
> *scanstate);
> +</programlisting>
> + API to scan the relation and fill the scan description bitmap with
> valid item pointers
> + for the specified block provided by the sample method.
> + </para>
>
> Looking at the code, this API selects the next block using the sampling
> method and nothing more, although I see that the heap AM implementation
> also does heapgetpage thus collecting live tuples in the array known only
> to heap AM. So, how about:
>
> API to select the next block of the relation using the given sampling
> method and set its information in the scan descriptor

"block" and "page" seems randomly choosed here and there. I don't
mind that seen in the core but..

> + <para>
> +<programlisting>
> +bool
> +scan_sample_next_tuple (TableScanDesc scan, struct SampleScanState
> *scanstate, TupleTableSlot *slot);
> +</programlisting>
> + API to fill the buffered heap tuple data from the bitmap scanned item
> pointers based on the sample
> + method and store it in the provided slot.
> + </para>
>
> How about:
>
> API to select the next tuple using the given sampling method from the set
> of tuples collected from the block previously selected by the sampling method

I'm not sure "from the set of tuples collected" is true. Just
"the state of sample scan" or something wouldn't be fine?

> + <para>
> +<programlisting>
> +void
> +scan_rescan (TableScanDesc scan, ScanKey key, bool set_params,
> + bool allow_strat, bool allow_sync, bool allow_pagemode);
> +</programlisting>
> + API to restart the relation scan with provided data.
> + </para>
>
> How about:
>
> API to restart the given scan using provided options, releasing any
> resources (such as buffer pins) already held by the scan

It looks too-detailed to me, but "with provided data" looks too
coarse..

> + <para>
> +<programlisting>
> +void
> +scan_update_snapshot (TableScanDesc scan, Snapshot snapshot);
> +</programlisting>
> + API to update the relation scan with the new snapshot.
> + </para>
>
> How about:
>
> API to set the visibility snapshot to be used by a given scan

If so, the function name should be "scan_set_snapshot". Anyway
the name look like "the function to update a snapshot (itself)".

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro HORIGUCHI 2018-12-13 05:59:20 Re: Pluggable Storage - Andres's take
Previous Message Michael Paquier 2018-12-13 05:55:07 Re: Cache lookup errors with functions manipulation object addresses