Re: PassDownLimitBound for ForeignScan/CustomScan [take-2]

From: Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
Cc: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: PassDownLimitBound for ForeignScan/CustomScan [take-2]
Date: 2017-01-03 03:07:11
Message-ID: CADyhKSXsWy6b3i5Ot6=9KiiBpPKOP00RFKCr9i+SwqarxUhpgw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Oops, I oversight this patch was marked as "returned with feedback",
not "moved to the next CF".

Its status has not been changed since the last update. (Code was
revised according to the last
comment by Jeevan, but CF-Nov was time up at that time.)

How do I handle the patch?

2016-12-05 16:49 GMT+09:00 Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>:
> Hello,
>
> Sorry for my late response.
> The attached patch reflects your comments.
>
>> Here are few comments on latest patch:
>>
>>
>> 1.
>> make/make check is fine, however I am getting regression failure in
>> postgres_fdw contrib module (attached regression.diff).
>> Please investigate and fix.
>>
> It was an incorrect interaction when postgres_fdw tries to push down
> sorting to the remote side. We cannot attach LIMIT clause on the plain
> scan path across SORT, however, the previous version estimated the cost
> for the plain scan with LIMIT clause even if local sorting is needed.
> If remote scan may return just 10 rows, estimated cost of the local sort
> is very lightweight, thus, this unreasonable path was chosen.
> (On the other hands, its query execution results were correct because
> ps_numTuples is not delivered across Sort node, so ForeignScan eventually
> scanned all the remote tuples. It made correct results but not optimal
> from the viewpoint of performance.)
>
> The v3 patch estimates the cost with remote LIMIT clause only if supplied
> pathkey strictly matches with the final output order of the query, thus,
> no local sorting is expected.
>
> Some of the regression test cases still have different plans but due to
> the new optimization by remote LIMIT clause.
> Without remote LIMIT clause, some of regression test cases preferred
> remote-JOIN + local-SORT then local-LIMIT.
> Once we have remote-LIMIT option, it allows to discount the cost for
> remote-SORT by choice of top-k heap sorting.
> It changed the optimizer's decision on some test cases.
>
> Potential one big change is the test case below.
>
> -- CROSS JOIN, not pushed down
> EXPLAIN (VERBOSE, COSTS OFF)
> SELECT t1.c1, t2.c1 FROM ft1 t1 CROSS JOIN ft2 t2 ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10;
>
> It assumed CROSS JOIN was not pushed down due to the cost for network
> traffic, however, remote LIMIT reduced the estimated number of tuples
> to be moved. So, all of the CROSS JOIN + ORDER BY + LIMIT became to run
> on the remote side.
>
>> 2.
>> + *
>> + * MEMO: root->limit_tuples is not attached when query
>> contains
>> + * grouping-clause or aggregate functions. So, we don's adjust
>> + * rows even if LIMIT <const> is supplied.
>>
>> Can you please explain why you are not doing this for grouping-clause or
>> aggregate functions.
>>
> See grouping_planner() at optimizer/plan/planner.c
> It puts an invalid value on the root->limit_tuples if query has GROUP BY clause,
> so we cannot know number of tuples to be returned even if we have upper limit
> actually.
>
> /*
> * Figure out whether there's a hard limit on the number of rows that
> * query_planner's result subplan needs to return. Even if we know a
> * hard limit overall, it doesn't apply if the query has any
> * grouping/aggregation operations, or SRFs in the tlist.
> */
> if (parse->groupClause ||
> parse->groupingSets ||
> parse->distinctClause ||
> parse->hasAggs ||
> parse->hasWindowFuncs ||
> parse->hasTargetSRFs ||
> root->hasHavingQual)
> root->limit_tuples = -1.0;
> else
> root->limit_tuples = limit_tuples;
>
>> 3.
>> Typo:
>>
>> don's => don't
>>
> Fixed,
>
> best regards,
> ----
> PG-Strom Project / NEC OSS Promotion Center
> KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>
>
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>

--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>

Attachment Content-Type Size
passdown-limit-fdw.v3.patch application/octet-stream 18.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2017-01-03 03:15:30 Re: Odd behavior with PG_TRY
Previous Message Tom Lane 2017-01-03 03:02:55 Re: [PATCH] Reload SSL certificates on SIGHUP