Re: CustomScan in a larger structure (RE: CustomScan support on readfuncs.c)

From: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: CustomScan in a larger structure (RE: CustomScan support on readfuncs.c)
Date: 2016-01-26 01:06:46
Message-ID: 9A28C8860F777E439AA12E8AEA7694F80119F58D@BPXM15GP.gisp.nec.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Sorry for my late response. I've been unavailable to have enough
time to touch code for the last 1.5 month.

The attached patch is a revised one to handle private data of
foregn/custom scan node more gracefully.

The overall consensus upthread were:
- A new ExtensibleNodeMethods structure defines a unique name
and a set of callbacks to handle node copy, serialization,
deserialization and equality checks.
- (Foreign|Custom)(Path|Scan|ScanState) are first host of the
ExtensibleNodeMethods, to allow extension to define larger
structure to store its private fields.
- ExtensibleNodeMethods does not support variable length
structure (like a structure with an array on the tail, use
separately allocated array).
- ExtensibleNodeMethods shall be registered on _PG_init() of
extensions.

The 'pgsql-v9.6-custom-private.v3.patch' is the main part of
this feature. As I pointed out before, it uses dynhash instead
of the self invented hash table.

Interfaces are defined as follows (not changed from v2):

typedef struct ExtensibleNodeMethods
{
const char *extnodename;
Size node_size;
void (*nodeCopy)(Node *newnode, const Node *oldnode);
bool (*nodeEqual)(const Node *a, const Node *b);
void (*nodeOut)(struct StringInfoData *str, const Node *node);
void (*nodeRead)(Node *node);
} ExtensibleNodeMethods;

extern void
RegisterExtensibleNodeMethods(const ExtensibleNodeMethods *methods);

extern const ExtensibleNodeMethods *
GetExtensibleNodeMethods(const char *extnodename, bool missing_ok);

Also, 'extensible-node-example-on-pgstrom.patch' is a working
example on its "GpuScan" node.
The code below uses all of copy, serialization and deserialization.

gscan = (GpuScan *)stringToNode(nodeToString(copyObject(cscan)));
elog(INFO, "GpuScan: %s", nodeToString(gscan));

Then, I could confirm private fields are reproduced correctly.

In addition to this, I'd like to suggest two small improvement.

On nodeOut callback, extension will need _outToken() and _outBitmap(),
however, these two functions are static. Entrypoint for extensions
are needed. (Of course, extension can copy & paste these small functions...)

ExtensibleNodeMethods may be registered with a unique pair of its
name and node-tag which is associated with. The current code requires
the name is unique to others, however, it may make a bit inconvenience.
In case of CustomScan, extension need to define three nodes: CustomPath,
CustomScan and CustomScanState, thus, ExtensibleNodeMethods which is
associated with these node must have individually unique name, like
"GpuScanPath", "GpuScan" and "GpuScanState".
If extnodename would be unique within a particular node type, we can
apply same name for all of the three above.

How about your thought?

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>

> -----Original Message-----
> From: Kaigai Kouhei(海外 浩平)
> Sent: Wednesday, December 02, 2015 5:52 PM
> To: 'Robert Haas'
> Cc: Andres Freund; Amit Kapila; pgsql-hackers
> Subject: Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan support
> on readfuncs.c)
>
> > On Thu, Nov 26, 2015 at 5:27 AM, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com> wrote:
> > > I'm now implementing. The above design perfectly works on ForeignScan.
> > > On the other hands, I'd like to have deeper consideration for CustomScan.
> > >
> > > My recent patch adds LibraryName and SymbolName on CustomScanMethods
> > > to lookup the method table even if library is not loaded yet.
> > > However, this ExtensibleNodeMethods relies custom scan provider shall
> > > be loaded, by parallel infrastructure, prior to the deserialization.
> > > It means extension has a chance to register itself as well.
> > >
> > > My idea is, redefine CustomScanMethod as follows:
> > >
> > > typedef struct ExtensibleNodeMethods
> > > {
> > > const char *extnodename;
> > > Size node_size;
> > > Node *(*nodeCopy)(const Node *from);
> > > bool (*nodeEqual)(const Node *a, const Node *b);
> > > void (*nodeOut)(struct StringInfoData *str, const Node *node);
> > > void (*nodeRead)(Node *node);
> > > } ExtensibleNodeMethods;
> > >
> > > typedef struct CustomScanMethods
> > > {
> > > union {
> > > const char *CustomName;
> > > ExtensibleNodeMethods xnode;
> > > };
> > > /* Create execution state (CustomScanState) from a CustomScan plan node
> > */
> > > Node *(*CreateCustomScanState) (struct CustomScan *cscan);
> > > } CustomScanMethods;
> > >
> > > It allows to pull CustomScanMethods using GetExtensibleNodeMethods
> > > by CustomName; that is alias of extnodename in ExtensibleNodeMethods.
> > > Thus, we don't need to care about LibraryName and SymbolName.
> > >
> > > This kind of trick is not needed for ForeignScan because all the pointer
> > > stuff are reproducible by catalog, however, method table of CustomScan
> > > is not.
> > >
> > > How about your opinion?
> >
> > Anonymous unions are not C89, so this is a no-go.
> >
> > I think we should just drop CustomName and replace it with
> > ExtensibleNodeMethods. That will break things for third-party code
> > that is using the existing CustomScan stuff, but there's a good chance
> > that nobody other than you has written any such code. And even if
> > someone has, updating it for this change will not be very difficult.
> >
> Thanks, I have same opinion.
> At this moment, CustomName is not utilized so much except for EXPLAIN
> and debug output, so it is not a hard stuff to replace this field by
> extnodename, even if custom scan provider does not have own structure
> thus no callbacks of node operations are defined.
>
> The attached patch adds 'extnodename' fields on ForeignPath and
> ForeignScan, also embeds ExtensibleNodeMethods in CustomPathMethods,
> CustomScanMethods and CustomExecMethods.
>
> Its handlers in copyfuncs.c were enhanced to utilize the callback
> to allocate a larger structure and copy private fields.
> Handlers in outfuncs.c and readfuncs.c have to be symmetric. The
> core backend writes out 'extnodename' prior to any private fields,
> then we can identify the callback to process rest of tokens for
> private fields.
>
> RegisterExtensibleNodeMethods() tracks a pair of 'extnodename'
> and ExtensibleNodeMethods itself. It saves the pointer of the
> method table, but not duplicate, because _readCustomScan()
> cast the method pulled by 'extnodename' thus registered table
> has to be a static variable on extension; that means extension
> never update ExtensibleNodeMethods once registered.
>
> The one other patch is my test using PG-Strom, however, I didn't
> have a test on FDW extensions yet.
>
> Thanks,
> --
> NEC Business Creation Division / PG-Strom Project
> KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>

Attachment Content-Type Size
extensible-node-example-on-pgstrom.patch application/octet-stream 13.0 KB
pgsql-v9.6-custom-private.v3.patch application/octet-stream 22.2 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2016-01-26 01:09:54 Re: Releasing in September
Previous Message Michael Paquier 2016-01-26 01:00:36 Re: why pg_size_pretty is volatile?