Re: Problems with plan estimates in postgres_fdw

From: Antonin Houska <ah(at)cybertec(dot)at>
To: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Problems with plan estimates in postgres_fdw
Date: 2019-03-01 11:00:41
Message-ID: 7273.1551438041@localhost
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> wrote:

> (2019/02/22 22:54), Antonin Houska wrote:
> > Etsuro Fujita<fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> >> On the other hand, the code bit added by
> >> 0001-postgres_fdw-Perform-UPPERREL_ORDERED-step-remotely-v3.patch handles the
> >> case where a post-scan/join processing step other than grouping/aggregation
> >> (ie, the final sort or LIMIT/OFFSET) is performed directly on a base or join
> >> relation, as in the LIMIT example. So, the two changes are handling different
> >> cases, hence both changes would be required.
> >
> > Do you mean this part?
>
> No, I don't. What I meant was this part:
>
> + /*
> + * If this is an UPPERREL_ORDERED step performed on the final
> + * scan/join relation, the costs obtained from the cache wouldn't yet
> + * contain the eval costs for the final scan/join target, which would
> + * have been updated by apply_scanjoin_target_to_paths(); add the eval
> + * costs now.
> + */
> + if (fpextra && !IS_UPPER_REL(foreignrel))
> + {
> + /* The costs should have been obtained from the cache. */
> + Assert(fpinfo->rel_startup_cost >= 0 &&
> + fpinfo->rel_total_cost >= 0);
> +
> + startup_cost += foreignrel->reltarget->cost.startup;
> + run_cost += foreignrel->reltarget->cost.per_tuple * rows;
> + }

> Yeah, but I think the code bit cited above is needed. Let me explain using
> yet another example with grouping and ordering:
>
> SELECT a+b, random() FROM foreign_table GROUP BY a+b ORDER BY a+b;
>
> For this query, the sort_input_target would be {a+b}, (to which the
> foreignrel->reltarget would have been set when called from
> estimate_path_cost_size() called from postgresGetForeignUpperPaths() with the
> UPPERREL_ORDERED stage), but the final target would be {a+b, random()}, so the
> sort_input_target isn't the same as the final target in that case; the costs
> needs to be adjusted so that the costs include the ones of generating the
> final target. That's the reason why I added the code bit you cited.

I used gdb to help me understand, however the condition

if (fpextra && !IS_UPPER_REL(foreignrel))

never evaluated to true with the query above. fpextra was only !=NULL when
estimate_path_cost_size() was called from add_foreign_ordered_paths(), but at
the same time foreignrel->reloptkind was RELOPT_UPPER_REL.

It's still unclear to me why add_foreign_ordered_paths() passes the input
relation (input_rel) to estimate_path_cost_size(). If it passed the output rel
(i.e. ordered_rel in this case) like add_foreign_grouping_paths() does, then
foreignrel->reltarget should contain the random() function. (What I see now is
that create_ordered_paths() leaves ordered_rel->reltarget empty, but that's
another problem to be fixed.)

Do you still see a reason to call estimate_path_cost_size() this way?

> > Note about the !activeWindows test: It occurs me now that no paths should be
> > added to UPPERREL_ORDERED relation if the query needs WindowAgg node, because
> > postgres_fdw currently cannot handle UPPERREL_WINDOW relation. Or rather in
> > general, the FDW should not skip any stage just because it doesn't know how to
> > build the paths.
>
> That's right. In postgres_fdw, by the following code in
> postgresGetForeignUpperPaths(), we will give up on doing the UPPERREL_ORDERED
> step remotely, if the query has the UPPERREL_WINDOW (and/or UPPERREL_DISTINCT)
> steps, because in that case we would have input_rel->fdw_private=NULL for a
> call to postgresGetForeignUpperPaths() with the UPPERREL_ORDERED stage:
>
> /*
> * If input rel is not safe to pushdown, then simply return as we cannot
> * perform any post-join operations on the foreign server.
> */
> if (!input_rel->fdw_private ||
> !((PgFdwRelationInfo *) input_rel->fdw_private)->pushdown_safe)
> return;

I understand now: if FDW did not handle the previous stage, then
input_rel->fdw_private is NULL.

--
Antonin Houska
https://www.cybertec-postgresql.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2019-03-01 11:13:04 Re: Add exclusive backup deprecation notes to documentation
Previous Message Etsuro Fujita 2019-03-01 10:45:45 Question about commit 11cf92f6e2e13c0a6e3f98be3e629e6bd90b74d5