Re: Problems with plan estimates in postgres_fdw

From: Antonin Houska <ah(at)cybertec(dot)at>
To: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Problems with plan estimates in postgres_fdw
Date: 2019-02-07 17:04:48
Message-ID: 10994.1549559088@localhost
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> wrote:

> (2018/12/28 15:50), Etsuro Fujita wrote:
> > Attached is a new version of the
> > patch.
>
> Here is an updated version of the patch set. Changes are:

I've spent some time reviewing the patches.

First, I wonder if the information on LIMIT needs to be passed to the
FDW. Cannot the FDW routines use root->tuple_fraction? I think (although have
not verified experimentally) that the problem reported in [1] is not limited
to the LIMIT clause. I think the same problem can happen if client uses cursor
and sets cursor_tuple_fraction very low. Since the remote node is not aware of
this setting when running EXPLAIN, the low fraction can also make postgres_fdw
think that retrieval of the whole set is cheaper than it will appear to be
during the actual execution.

Some comments on coding:

0001-postgres_fdw-Perform-UPPERREL_ORDERED-step-remotely-v3.patch
-----------------------------------------------------------------

* I'm not sure if UPPERREL_ORDERD should to deal with LIMIT at all. Note that
grouping_planner() does not call create_limit_path() until it's done with
create_ordered_paths(), and create_ordered_paths() is where the FDW is
requested to add its paths to UPPERREL_ORDERED relation.

* add_foreign_ordered_paths()

Exprssion root->upper_targets[UPPERREL_FINAL] is used to access the target.

I think create_ordered_paths() should actually set the target so that
postgresGetForeignUpperPaths() can simply find it in
output_rel->reltarget. This is how postgres_fdw already retrieves the target
for grouped paths. Small patch is attached that makes this possible.

* regression tests: I think test(s) should be added for queries that have
ORDER BY clause but do not have GROUP BY (and also no LIMIT / OFFSET)
clause. I haven't noticed such tests.

0002-postgres_fdw-Perform-UPPERREL_FINAL-step-remotely-v3.patch
---------------------------------------------------------------

* add_foreign_final_paths()

Similar problem I reported for add_foreign_ordered_paths() above.

* adjust_limit_rows_costs()

Doesn't this function address more generic problem than the use of
postgres_fdw? If so, then it should be submitted as a separate patch. BTW, the
function is only used in pathnode.c, so it should be declared static.

Both patches:
------------

One thing that makes me confused when I try to understand details is that the
new functions add_foreign_ordered_paths() and add_foreign_final_paths() both
pass "input_rel" for the "foreignrel" argument of estimate_path_cost_size():

estimate_path_cost_size(root, input_rel, ...)

whereas the existing add_foreign_grouping_paths() passes "grouped_rel":

estimate_path_cost_size(root, grouped_rel, ...)

Can you please make this consistent? If you do, you'll probably need to
reconsider your changes to estimate_path_cost_size().

And maybe related problem: why should FDW care about the effect of
apply_scanjoin_target_to_paths() like you claim to do here?

/*
* If this is UPPERREL_ORDERED and/or UPPERREL_FINAL steps on the
* final scan/join relation, the costs obtained from the cache
* wouldn't yet contain the eval cost for the final scan/join target,
* which would have been updated by apply_scanjoin_target_to_paths();
* add the eval cost now.
*/
if (fpextra && !IS_UPPER_REL(foreignrel))
{
/* The costs should have been obtained from the cache. */
Assert(fpinfo->rel_startup_cost >= 0 &&
fpinfo->rel_total_cost >= 0);

startup_cost += foreignrel->reltarget->cost.startup;
run_cost += foreignrel->reltarget->cost.per_tuple * rows;
}

I think it should not, whether "foreignrel" means "input_rel" or "output_rel"
from the perspective of postgresGetForeignUpperPaths().

[1] https://www.postgresql.org/message-id/87pnz1aby9.fsf@news-spur.riddles.org.uk

--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Web: https://www.cybertec-postgresql.com

Attachment Content-Type Size
targets.patch text/x-diff 1.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Fetter 2019-02-07 17:20:47 Re: Tighten up a few overly lax regexes in pg_dump's tap tests
Previous Message Nathan Wagner 2019-02-07 16:57:22 Re: bug tracking system