Quick Links

Re: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API)

From:	Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To:	Shigeru HANADA <shigeru(dot)hanada(at)gmail(dot)com>
Cc:	Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Thom Brown <thom(at)linux(dot)com>, "pgsql-hackers(at)postgreSQL(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API)
Date:	2015-04-09 12:22:15
Message-ID:	9A28C8860F777E439AA12E8AEA7694F8010CDAD4@BPXM15GP.gisp.nec.co.jp
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

> 2015/04/09 10:48、Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com> のメール：
> * merge_fpinfo()
> >>> It seems to me fpinfo->rows should be joinrel->rows, and
> >>> fpinfo->width also should be joinrel->width.
> >>> No need to have special intelligence here, isn't it?
> >>
> >>
> >> Oops. They are vestige of my struggle which disabled SELECT clause optimization
> >> (omit unused columns). Now width and rows are inherited from joinrel.
> Besides
> >> that, fdw_startup_cost and fdw_tuple_cost seem wrong, so I fixed them to use
> simple
> >> summary, not average.
> >>
> > Does fpinfo->fdw_startup_cost represent a cost to open connection to remote
> > PostgreSQL, doesn't it?
> >
> > postgres_fdw.c:1757 says as follows:
> >
> > /*
> > * Add some additional cost factors to account for connection overhead
> > * (fdw_startup_cost), transferring data across the network
> > * (fdw_tuple_cost per retrieved row), and local manipulation of the data
> > * (cpu_tuple_cost per retrieved row).
> > */
> >
> > If so, does a ForeignScan that involves 100 underlying relation takes 100
> > times heavy network operations on startup? Probably, no.
> > I think, average is better than sum, and max of them will reflect the cost
> > more correctly.
>
> In my current opinion, no. Though I remember that I've written such comments
> before :P.
>
> Connection establishment occurs only once for the very first access to the server,
> so in the use cases with long-lived session (via psql, connection pooling, etc.),
> taking connection overhead into account *every time* seems too pessimistic.
>
> Instead, for practical cases, fdw_startup_cost should consider overheads of query
> construction and getting first response of it (hopefully it minus retrieving
> actual data). These overheads are visible in the order of milliseconds. I’m
> not sure how much is appropriate for the default, but 100 seems not so bad.
>
> Anyway fdw_startup_cost is per-server setting as same as fdw_tuple_cost, and it
> should not be modified according to the width of the result, so using
> fpinfo_o->fdw_startup_cost would be ok.
>
Indeed, I forgot the connection cache mechanism. As long as we define
fdw_startup_cost as you mentioned, it seems to me your logic is heuristically
reasonable.

> > Also, fdw_tuple_cost introduce the cost of data transfer over the network.
> > I thinks, weighted average is the best strategy, like:
> > fpinfo->fdw_tuple_cost =
> > (fpinfo_o->width / (fpinfo_o->width + fpinfo_i->width) *
> fpinfo_o->fdw_tuple_cost +
> > (fpinfo_i->width / (fpinfo_o->width + fpinfo_i->width) *
> fpinfo_i->fdw_tuple_cost;
> >
> > That's just my suggestion. Please apply the best way you thought.
>
> I can’t agree that strategy, because 1) width 0 causes per-tuple cost 0, and 2)
> fdw_tuple_cost never vary in a foreign server. Using fpinfo_o->fdw_tuple_cost
> (it must be identical to fpinfo_i->fdw_tuple_cost) seems reasonable. Thoughts?
>
OK, you are right.

I think it is time to hand over the patch reviewing to committers.
So, let me mark it "ready for committers".

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>

Responses

Re: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API) at 2015-04-10 06:00:58 from Shigeru HANADA

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Magnus Hagander	2015-04-09 12:31:46	psql showing owner in \dT
Previous Message	Tom Lane	2015-04-09 12:20:45	Re: "rejected" vs "returned with feedback" in new CF app