|From:||Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>|
|To:||Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>|
|Cc:||"pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, "Etsuro Fujita" <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>|
|Subject:||Re: PassDownLimitBound for ForeignScan/CustomScan|
|Views:||Raw Message | Whole Thread | Download mbox | Resend email|
> -----Original Message-----
> From: pgsql-hackers-owner(at)postgresql(dot)org
> [mailto:pgsql-hackers-owner(at)postgresql(dot)org] On Behalf Of Kouhei Kaigai
> Sent: Monday, September 05, 2016 12:58 PM
> To: Jeevan Chalke
> Cc: pgsql-hackers(at)postgresql(dot)org; Etsuro Fujita
> Subject: Re: [HACKERS] PassDownLimitBound for ForeignScan/CustomScan
> > On Mon, Aug 29, 2016 at 7:25 AM, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com> wrote:
> > Hello,
> > The attached patch adds an optional callback to support special optimization
> > if ForeignScan/CustomScan are located under the Limit node in plan-tree.
> > Our sort node wisely switches the behavior when we can preliminary know
> > exact number of rows to be produced, because all the Sort node has to
> > return is the top-k rows when it is located under the Limit node.
> > It is much lightweight workloads than sorting of entire input rows when
> > nrows is not small.
> > In my case, this information is very useful because GPU can complete its
> > sorting operations mostly on L1-grade memory if we can preliminary know
> > the top-k value is enough small and fits to size of the fast memory.
> > Probably, it is also valuable for Fujita-san's case because this information
> > allows to attach "LIMIT k" clause on the remote query of postgres_fdw.
> > It will reduce amount of the network traffic and remote CPU consumption
> > once we got support of sort pushdown.
> > One thing we need to pay attention is cost estimation on the planner stage.
> > In the existing code, only create_ordered_paths() and
> > create_merge_append_path()
> > considers the limit clause for cost estimation of sorting. They use the
> > 'limit_tuples' of PlannerInfo; we can reference the structure when extension
> > adds ForeignPath/CustomPath, so I think we don't need a special enhancement
> > on the planner stage.
> Thanks for your comments.
> > I believe this hook is gets called at execution time.
> > So to push LIMIT clause like you said above we should use "limit_tuples" at the time
> > of planning and then use this hook to optimize at runtime, right?
> Yes. For more correctness, a valid "limit_tuples" of PlannerInfo is set only when
> LIMIT clause takes constant values; it is true for most of use case.
> Then, the hook I added shall be called at execution time for more exact optimization.
> If FDW/CSP cannot accept uncertain number of rows to generate on planning time,
> it is not a duty to provide its own path which is optimized for small number of
> LIMIT clause.
> > Apart from that, attached patch applies cleanly on latest sources and found no issues
> > with make or with regressions.
> > However this patch is an infrastructure for any possible optimization when
> > foreign/customscan is under LIMIT.
> > So look good to me.
> > I quickly tried adding a hook support in postgres_fdw, and it gets called correctly
> > when we have foreignscan with LIMIT (limit being evaluated on local server).
> > So code wise no issue. Also add this hook details in documentation.
> OK, I'll try to write up some detailed documentation stuff; not only API specification.
The v2 patch attached. It introduces the role of this hook and how extension
utilizes the LIMIT clause for its further optimization on planning and
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>
|Next Message||Michael Paquier||2016-09-06 01:41:27||Re: Speedup twophase transactions|
|Previous Message||Craig Ringer||2016-09-06 01:20:22||Re: [PATCH] COPY vs \copy HINT|