From: | "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at> |
---|---|
To: | "Kohei KaiGai *EXTERN*" <kaigai(at)kaigai(dot)gr(dot)jp> |
Cc: | "Atri Sharma" <atri(dot)jiit(at)gmail(dot)com>, "Alexander Korotkov *EXTERN*" <aekorotkov(at)gmail(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Robert Haas" <robertmhaas(at)gmail(dot)com>, "PgHacker" <pgsql-hackers(at)postgresql(dot)org>, "Shigeru Hanada" <shigeru(dot)hanada(at)gmail(dot)com> |
Subject: | Re: [v9.3] writable foreign tables |
Date: | 2012-11-20 08:53:11 |
Message-ID: | D960CB61B694CF459DCFB4B0128514C208B880DE@exadv11.host.magwien.gv.at |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Kohei KaiGai wrote:
>>> This design tries to kill two-birds with one-stone.
>>> It enables to add multiple number of pseudo-columns,
>>> not only "rowid", and makes possible to push-down
>>> complex calculation of target list into external computing
>>> resource.
>>>
>>> For example, when user gives the following query:
>>>
>>> SELECT ((c1 - c2) * (c2 - c3))^2 FROM ftable
>>>
>>> it contains a complex calculation in the target-list,
>>> thus, it also takes CPU cycles of local process.
>>>
>>> If we can replace the "((c1 - c2) * (c2 - c3))^2" by
>>> a reference to a pseudo-column that also references
>>> the calculation result on external node, it effectively
>>> off-load CPU cycles.
>>>
>>> In this case, all we need to do is (1) acquire a slot
>>> for pseudo-column at GetForeignRelInfo (2) replace
>>> TargetEntry::expr by Var node that reference this
>>> pseudo-column.
>>>
>>> It makes sense for performance optimization, so I don't
>>> want to restrict this handler for "rowid" only.
>> I understand.
>>
>> But I think that you still can do that with the change that
>> I suggest. I suggest that GetForeignRelInfo (or whatever the
>> name ends up being) gets the AttrNumber of the proposed "rowid"
>> column in addition to the parameters you need for what
>> you want to do.
>>
>> Then nothing would keep you from defining those
>> pseudo-columns. But all the setup necessary for the "rowid"
>> column could be moved out of the FDW. So for the 99% of all
>> FDW which are only interested in the rowid, things would
>> get much simpler and they don't all have to implement the
>> same code.
> All we have to do at get_relation_info() to deal with pseudo-
> columns (including alternatives of complex calculation, not
> only "rowid") is just expansion of rel->max_attr.
> So, if FDW is not interested in something except for "rowid",
> it can just inform the caller "Yeah, we need just one slot for
> a pseudo-column of rowid". Otherwise, it can return another
> value to acquire the slot for arbitrary pseudo-column.
> I don't think it is a problematic design.
>
> However, I'm skeptical 99% of FDWs don't support target-list
> push-down. At least, it was very desired feature when I had
> a talk at PGconf.EU last month. :-)
I agree that PostgreSQL should make this technique possible.
My idea should not make this any more difficult.
> So, if we rename it to GetForeignRelWidth, is it defined as
> follows?
>
> extern AttrNumber
> GetForeignRelWidth(PlannerInfo *root,
> RelOptInfo *baserel,
> Oid foreigntableid,
> bool inhparent,
> List *targetList);
>
> Right now, inhparent makes no sense because foreign table
> does not support table inheritance, but it seems to me we
> shall have this functionality near future.
I am thinking of this declaration:
extern bool
GetForeignRelWidth(PlannerInfo *root,
RelOptInfo *baserel,
Oid foreigntableid,
bool inhparent,
List *targetList,
AttrNumber rowidAttr);
Let me illustrate my idea with some code.
Here's your fileGetForeignRelInfo:
static void
fileGetForeignRelInfo(PlannerInfo *root,
RelOptInfo *baserel,
Oid foreigntableid,
bool inhparent, List *targetList)
{
FileFdwPlanState *fdw_private;
ListCell *cell;
fdw_private = (FileFdwPlanState *)
palloc0(sizeof(FileFdwPlanState));
foreach(cell, targetList)
{
TargetEntry *tle = lfirst(cell);
if (tle->resjunk &&
tle->resname && strcmp(tle->resname, "rowid")==0)
{
Bitmapset *temp = NULL;
AttrNumber anum_rowid;
DefElem *defel;
pull_varattnos((Node *)tle, baserel->relid, &temp);
anum_rowid = bms_singleton_member(temp)
+ FirstLowInvalidHeapAttributeNumber;
/* adjust attr_needed of baserel */
if (anum_rowid > baserel->max_attr)
baserel->max_attr = anum_rowid;
defel = makeDefElem("anum_rowid",
(Node *)makeInteger(anum_rowid));
fdw_private->anum_rowid = defel;
}
}
baserel->fdw_private = fdw_private;
}
I hope that this can be reduced to:
static bool
fileGetForeignRelRowid(PlannerInfo *root,
RelOptInfo *baserel,
Oid foreigntableid,
bool inhparent,
List *targetList,
AttrNumber rowidAttr)
{
FileFdwPlanState *fdw_private;
fdw_private = (FileFdwPlanState *)
palloc0(sizeof(FileFdwPlanState));
defel = makeDefElem("anum_rowid",
(Node *)makeInteger(rowidAttr));
fdw_private->anum_rowid = defel;
baserel->fdw_private = fdw_private;
return true; /* we'll use rowid, so please extend baserel->max_attr
*/
}
That wouldn't mean that the FDW cannot define any other
pseudo-columns in this function, just the case of rowid
would be simplified.
Does that make any sense?
Yours,
Laurenz Albe
From | Date | Subject | |
---|---|---|---|
Next Message | Dimitri Fontaine | 2012-11-20 09:08:52 | Re: Dumping an Extension's Script |
Previous Message | Pavel Stehule | 2012-11-20 08:48:21 | Re: Timing events WIP v1 |