From: | Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> |
---|---|
To: | Antonin Houska <ah(at)cybertec(dot)at> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Problems with plan estimates in postgres_fdw |
Date: | 2019-03-01 12:42:47 |
Message-ID: | 5C7928C7.50507@lab.ntt.co.jp |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
(2019/03/01 20:00), Antonin Houska wrote:
> Etsuro Fujita<fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>> (2019/02/22 22:54), Antonin Houska wrote:
>>> Etsuro Fujita<fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>>>> So, the two changes are handling different
>>>> cases, hence both changes would be required.
>> + /*
>> + * If this is an UPPERREL_ORDERED step performed on the final
>> + * scan/join relation, the costs obtained from the cache wouldn't yet
>> + * contain the eval costs for the final scan/join target, which would
>> + * have been updated by apply_scanjoin_target_to_paths(); add the eval
>> + * costs now.
>> + */
>> + if (fpextra&& !IS_UPPER_REL(foreignrel))
>> + {
>> + /* The costs should have been obtained from the cache. */
>> + Assert(fpinfo->rel_startup_cost>= 0&&
>> + fpinfo->rel_total_cost>= 0);
>> +
>> + startup_cost += foreignrel->reltarget->cost.startup;
>> + run_cost += foreignrel->reltarget->cost.per_tuple * rows;
>> + }
>
>> Yeah, but I think the code bit cited above is needed. Let me explain using
>> yet another example with grouping and ordering:
>>
>> SELECT a+b, random() FROM foreign_table GROUP BY a+b ORDER BY a+b;
>>
>> For this query, the sort_input_target would be {a+b}, (to which the
>> foreignrel->reltarget would have been set when called from
>> estimate_path_cost_size() called from postgresGetForeignUpperPaths() with the
>> UPPERREL_ORDERED stage), but the final target would be {a+b, random()}, so the
>> sort_input_target isn't the same as the final target in that case; the costs
>> needs to be adjusted so that the costs include the ones of generating the
>> final target. That's the reason why I added the code bit you cited.
>
> I used gdb to help me understand, however the condition
>
> if (fpextra&& !IS_UPPER_REL(foreignrel))
>
> never evaluated to true with the query above.
Sorry, my explanation was not enough again, but I showed that query
("SELECT a+b, random() FROM foreign_table GROUP BY a+b ORDER BY a+b;")
to explain why the following code bit is needed:
+ /*
+ * If this includes an UPPERREL_ORDERED step, the given target,
which
+ * would be the final target to be applied to the resulting
path, might
+ * have different expressions from the underlying relation's
reltarget
+ * (see make_sort_input_target()); adjust tlist eval costs.
+ */
+ if (fpextra&& fpextra->target != foreignrel->reltarget)
+ {
+ QualCost oldcost = foreignrel->reltarget->cost;
+ QualCost newcost = fpextra->target->cost;
+
+ startup_cost += newcost.startup - oldcost.startup;
+ total_cost += newcost.startup - oldcost.startup;
+ total_cost += (newcost.per_tuple - oldcost.per_tuple) *
rows;
+ }
I think that the condition
if (fpextra && fpextra->target != foreignrel->reltarget)
would be evaluated to true for that query.
> It's still unclear to me why add_foreign_ordered_paths() passes the input
> relation (input_rel) to estimate_path_cost_size(). If it passed the output rel
> (i.e. ordered_rel in this case) like add_foreign_grouping_paths() does, then
> foreignrel->reltarget should contain the random() function. (What I see now is
> that create_ordered_paths() leaves ordered_rel->reltarget empty, but that's
> another problem to be fixed.)
>
> Do you still see a reason to call estimate_path_cost_size() this way?
Yeah, the reason for that is because we can estimate the costs of
sorting for the final scan/join relation using the existing code as-is
that estimates the costs of sorting for base or join relations, except
for tlist eval cost adjustment. As you mentioned, we could pass
ordered_rel to estimate_path_cost_size(), but if so, I think we would
need to get input_rel (ie, the final scan/join relation) from
ordered_rel within estimate_path_cost_size(), to use that code, which
would not be great.
Best regards,
Etsuro Fujita
From | Date | Subject | |
---|---|---|---|
Next Message | Surafel Temesgen | 2019-03-01 12:47:43 | Re: FETCH FIRST clause PERCENT option |
Previous Message | David Rowley | 2019-03-01 12:35:22 | Re: using index or check in ALTER TABLE SET NOT NULL |