Re: Wrong cost estimation for foreign tables join with use_remote_estimate disabled

From: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
To: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Wrong cost estimation for foreign tables join with use_remote_estimate disabled
Date: 2018-06-22 11:09:35
Message-ID: 13c689b7-0724-5fa9-558e-ae2be68094d7@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 22.06.2018 13:30, Ashutosh Bapat wrote:
> On Fri, Jun 22, 2018 at 11:56 AM, Konstantin Knizhnik
> <k(dot)knizhnik(at)postgrespro(dot)ru> wrote:
>>
>> On 21.06.2018 20:08, Tom Lane wrote:
>>> Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru> writes:
>>>> The following very simple test reduce the problem with wrong cost
>>>> estimation:
>>>> create foreign table t1_fdw(x integer, y integer) server pg_fdw options
>>>> (table_name 't1', use_remote_estimate 'false');
>>>> create foreign table t2_fdw(x integer) server pg_fdw options (table_name
>>>> 't2', use_remote_estimate 'false');
>>>> It is possible to force Postgres to use correct plan by setting
>>>> "fdw_startup_cost" to some very large value (100000000 for example).
>>>> ...
>>>> Also correct plan is used when use_remote_estimate is true.
>>> If you are unhappy about the results with use_remote_estimate off, don't
>>> run it that way. The optimizer does not have a crystal ball.
>>
>> As I wrote, use_remote_estimate can not be used because in this case query
>> compilation time is unacceptable (10 seconds, while time of query execution
>> itself is ~200msec).
>> So the problem can be addressed in two ways:
>>
>> 1. Try to reduce time of remote estimation. I wonder why postgres_fdw sends
>> so much queries to remote server. For join of two tables there are 7
>> queries.
>> I suspect that for ~20 joined tables in the original query number of calls
>> is more than hundred, so on wonder that it takes so much time.
>> 2. Try to make optimizer make better estimation of join cost based on local
>> statistic (please notice that ANALYZE is explicitly called for all foreign
>> tables and number of rows in the result was correctly calculated).
>>
> I think estimate_path_cost_size() is too pessimistic about how many
> times the join conditions are evaluated (Sorry, I have written that
> code when I was worked on join pushdown for postgres_fdw.)
>
> /* Estimate of number of rows in cross product */
> nrows = fpinfo_i->rows * fpinfo_o->rows;
>
> and somewhere down in the code
> run_cost += nrows * join_cost.per_tuple;
>
> It assumes that the join conditions are run on the cross-product of
> the joining tables. In reality that never happens for large tables. In
> such cases the optimizer will choose either hash or merge join, which
> will apply join conditions only on a small portion of cross-product.
> But the reason it was written that way was the local server can not
> estimate the fraction of cross product on which the join conditions
> will be applied. May be we could assume that the join conditions will
> be applied to only 1% of the cross product, i.e. run_cost +=
> clamp_rows(nrows/100) * join_cost.per_tuple. With this change I think
> the cost of remote plan will be less than local plan.
>
> Here's a preview of blog, I am planning to publish soon, about this
> issue at [1]. It has a bit more details.
>
> [1] https://www.blogger.com/blogger.g?blogID=5253679863234367862#editor/target=post;postID=4019325618679658571;onPublishedMenu=allposts;onClosedMenu=allposts;postNum=0;src=postname
>
Yes, postgres_fdw is very conservative or event pessimistic regarding
cost of joins.
It really assumes that there will be cross join with applied filter,
which is not true in most cases.
It's a pity. especially taken in account that based on local statistic
it is able to correctly predict number of rows in the result of join.
So if w take in account this estimated number of retrieved rows in
calculation of join cost, then estimation is more correct and right plan
(with remote joins) is chosen:

diff --git a/contrib/postgres_fdw/postgres_fdw.c
b/contrib/postgres_fdw/postgres_fdw.c
index 78b0f43..84b30ce 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -2812,9 +2812,8 @@ estimate_path_cost_size(PlannerInfo *root,
                         * 4. Run time cost of applying nonpushable
other clauses locally
                         * on the result fetched from the foreign server.
                         */
-                       run_cost = fpinfo_i->rel_total_cost -
fpinfo_i->rel_startup_cost;
-                       run_cost += fpinfo_o->rel_total_cost -
fpinfo_o->rel_startup_cost;
-                       run_cost += nrows * join_cost.per_tuple;
+                       run_cost = (fpinfo_i->rel_total_cost -
fpinfo_i->rel_startup_cost) * retrieved_rows / fpinfo_i->rows ;
+                       run_cost += (fpinfo_o->rel_total_cost -
fpinfo_o->rel_startup_cost) * retrieved_rows / fpinfo_o->rows;
                        nrows = clamp_row_est(nrows *
fpinfo->joinclause_sel);
                        run_cost += nrows * remote_conds_cost.per_tuple;
                        run_cost += fpinfo->local_conds_cost.per_tuple
* retrieved_rows;

I also tried to do something with first approach: speed up remote
estimation. My idea was to use local estimation whenever possible and
use remote estimation only for joins.
In case of joining two tables it cause sending only one EXPLAIN request
to remote server. But for larger number of joined table amount of
considered pathes and so number of remote EXPLAIN
requests is growing very fast. For 4 joins 22 explains are issued. So
the approach with restricting number of foreign estimations seems to be
more difficult to implement and less promising.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrey V. Lepikhov 2018-06-22 11:24:13 Re: [WIP] [B-Tree] Retail IndexTuple deletion
Previous Message Ashutosh Bapat 2018-06-22 10:36:27 Re: Wrong cost estimation for foreign tables join with use_remote_estimate disabled