Re: PassDownLimitBound for ForeignScan/CustomScan

From: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: PassDownLimitBound for ForeignScan/CustomScan
Date: 2016-09-14 01:07:46
Message-ID: 9A28C8860F777E439AA12E8AEA7694F80122A00D@BPXM15GP.gisp.nec.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Tue, Sep 13, 2016 at 3:48 AM, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com> wrote:
> > Sorry for my late.
> >
> > The attached patch fixed the wording problems on SGML part.
>
> I agree that we should have some way for foreign data wrappers and
> custom scans and perhaps also other executor nodes to find out whether
> there's a known limit to the number of tuples that they might need to
> produce, but I wonder if we should be doing something more general
> than this. For example, suppose we add a new PlanState member "long
> numTuples" where 0 means that the number of tuples that will be needed
> is unknown (so that most node types need not initialize it), a
> positive value is an upper bound on the number of tuples that will be
> fetched, and -1 means that it is known for certain that we will need
> all of the tuples. This might be relevant to the executor batching
> stuff that Andres has been working on, because you could for example
> set ps->numTuples == 1 on the inner side of a semi-join, warning the
> executor node that it shouldn't bother trying to batch anything.
>
I also think the generic approach is a preferable direction.

In the current implementation calls recompute_limits() on the first
invocation of ExecLimit and ExecReScanLimit. Do we expect the
ps->numTuples will be also passed down to the child nodes on the same
timing?
I also think this new executor contract shall be considered as a hint
(but not a requirement) for the child nodes, because it allows the
parent nodes to re-distribute the upper limit regardless of the type
of the child nodes as long as the parent node can work correctly and
has benefit even if the child node returns a part of tuples. It makes
the decision whether the upper limit should be passed down much simple.
The child node "can" ignore the hint but can utilize for more optimization.

> On a more practical level, I notice that you haven't adapted
> postgres_fdw or file_fdw to benefit from this new callback. It seems
> like postgres_fdw could benefit, because it could fetch only the
> required number of tuples if that happens to be a smaller number than
> the configured fetch_size.
>
It is because of just my time pressure around the patch submission days.
I'll try to enhance postgres_fdw as a usage of this run-time optimization.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2016-09-14 01:10:22 Re: DISABLE_PAGE_SKIPPING option for vacuumdb
Previous Message Michael Paquier 2016-09-14 01:04:38 Re: WAL consistency check facility