Re: PassDownLimitBound for ForeignScan/CustomScan

From: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, "Etsuro Fujita" <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
Subject: Re: PassDownLimitBound for ForeignScan/CustomScan
Date: 2016-09-05 03:58:02
Message-ID: 9A28C8860F777E439AA12E8AEA7694F8012245CC@BPXM15GP.gisp.nec.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Mon, Aug 29, 2016 at 7:25 AM, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com> wrote:
>
>
> Hello,
>
> The attached patch adds an optional callback to support special optimization
> if ForeignScan/CustomScan are located under the Limit node in plan-tree.
>
> Our sort node wisely switches the behavior when we can preliminary know
> exact number of rows to be produced, because all the Sort node has to
> return is the top-k rows when it is located under the Limit node.
> It is much lightweight workloads than sorting of entire input rows when
> nrows is not small.
>
> In my case, this information is very useful because GPU can complete its
> sorting operations mostly on L1-grade memory if we can preliminary know
> the top-k value is enough small and fits to size of the fast memory.
>
> Probably, it is also valuable for Fujita-san's case because this information
> allows to attach "LIMIT k" clause on the remote query of postgres_fdw.
> It will reduce amount of the network traffic and remote CPU consumption
> once we got support of sort pushdown.
>
>
>
> One thing we need to pay attention is cost estimation on the planner stage.
> In the existing code, only create_ordered_paths() and
> create_merge_append_path()
> considers the limit clause for cost estimation of sorting. They use the
> 'limit_tuples' of PlannerInfo; we can reference the structure when extension
> adds ForeignPath/CustomPath, so I think we don't need a special enhancement
> on the planner stage.
>
Thanks for your comments.

> I believe this hook is gets called at execution time.
> So to push LIMIT clause like you said above we should use "limit_tuples" at the time
> of planning and then use this hook to optimize at runtime, right?
>
Yes. For more correctness, a valid "limit_tuples" of PlannerInfo is set only when
LIMIT clause takes constant values; it is true for most of use case.
Then, the hook I added shall be called at execution time for more exact optimization.

If FDW/CSP cannot accept uncertain number of rows to generate on planning time,
it is not a duty to provide its own path which is optimized for small number of
LIMIT clause.

> Apart from that, attached patch applies cleanly on latest sources and found no issues
> with make or with regressions.
>
> However this patch is an infrastructure for any possible optimization when
> foreign/customscan is under LIMIT.
>
> So look good to me.
>
> I quickly tried adding a hook support in postgres_fdw, and it gets called correctly
> when we have foreignscan with LIMIT (limit being evaluated on local server).
>
> So code wise no issue. Also add this hook details in documentation.
>
OK, I'll try to write up some detailed documentation stuff; not only API specification.

Best regards,

>
> Thanks
>
>
>
> Thanks,
> --
> NEC Business Creation Division / PG-Strom Project
> KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>
>
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
> <http://www.postgresql.org/mailpref/pgsql-hackers>
>
>
>
>
>
>
> --
>
> Jeevan B Chalke
> Principal Software Engineer, Product Development
> EnterpriseDB Corporation
> The Enterprise PostgreSQL Company
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2016-09-05 04:03:03 Re: Speed up Clog Access by increasing CLOG buffers
Previous Message Tsunakawa, Takayuki 2016-09-05 03:12:01 Re: [bug fix] Cascading standby cannot catch up and get stuck emitting the same message repeatedly