|From:||Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>|
|To:||Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>|
|Subject:||Re: Push down more full joins in postgres_fdw|
|Views:||Raw Message | Whole Thread | Download mbox | Resend email|
On 2016/09/06 22:07, Ashutosh Bapat wrote:
> On Fri, Sep 2, 2016 at 3:55 PM, Etsuro Fujita
> <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp <mailto:fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>> wrote:
> On 2016/08/22 15:49, Ashutosh Bapat wrote:
> 1. deparsePlaceHolderVar looks odd - each of the deparse*
> function is
> named as deparse + <name of the parser node the string would parse
> into>. PlaceHolderVar is not a parser node, so no string is
> going to be
> parsed as a PlaceHolderVar. May be you want to name it as
> deparseExerFromPlaceholderVar or something like that.
> The name "deparseExerFromPlaceholderVar" seems long to me. How
> about "deparsePlaceHolderExpr"?
> There isn't any node with name PlaceHolderExpr.
I'll rename it to "deparseExerInPlaceholderVar", then.
> 3. I think registerAlias stuff should happen really at the time of
> creating paths and should be stored in fpinfo. Without that it
> will be
> computed every time we deparse the query. This means every time
> we try
> to EXPLAIN the query at various levels in the join tree and once
> for the
> query to be sent to the foreign server.
> Hmm. I think the overhead in calculating aliases would be
> negligible compared to the overhead in explaining each remote query
> for costing or sending the remote query for execution. So, I
> created aliases in the same way as remote params created and stored
> into params_list in deparse_expr_cxt. I'm not sure it's worth
> complicating the code.
> We defer remote parameter creation till deparse time since the the
> parameter numbers are dependent upon the sequence in which we deparse
> the query. Creating them at the time of path creation and storing them
> in fpinfo is not possible because a. those present in the joining
> relations will conflict with each other and need some kind of
> serialization at the time of deparsing b. those defer for differently
> parameterized paths or paths with different pathkeys. We don't defer it
> because it's very light on performance.
> That's not true with the alias information. As long as we detect which
> relations need subqueries, their RTIs are enough to create unique
> aliases e.g. if a relation involves RTIs 1, 2, 3 corresponding subquery
> can have alias r123 without conflicting with any other alias.
Hmm. But another concern about the approach of generating an subselect
alias for a path, if needed, at the path creation time would be that the
path might be rejected by add_path, which would result in totally
wasting cycles for generating the subselect alias for the path.
> However minimum overhead it might have, we will deparse the query every
> time we create a path involving that kind of relation i.e. for different
> pathkeys, different parameterization and different joins in which that
> relation participates in. This number can be significant. We want to
> avoid this overhead if we can.
Exactly. As I said above, I think the overhead would be negligible
compared to the overhead in explaining each remote query for costing or
the overhead in sending the final remote query for execution, though.
> 5. The blocks related to inner and outer relations in
> deparseFromExprForRel() look same. We should probably separate
> that code
> out into a function and call it at two places.
> I see you have created function deparseOperandRelation() for the
> same. I guess, this should be renamed as deparseRangeTblRef() since it
> will create RangeTblRef node when parsed back.
OK, if there no opinions of others, I'll rename it to the name proposed
by you, "deparseRangeTblRef".
> ! if (is_placeholder)
> ! errcontext("placeholder expression at position %d in
> select list",
> ! errpos->cur_attno);
> A user wouldn't know what a placeholder expression is, there is
> no such
> term in the documentation. We have to device a better way to
> provide an
> error context for an expression in general.
> Though I proposed that, I don't think that it's that important to
> let users know that the expression is from a PHV. How about just
> saying "expression", not "placeholder expression", so that we have
> the message "expression at position %d in select list" in the context?
> Hmm, that's better than placeholder expression, but not as explanatory
> as it should be since we won't be printing the "select list" in the
> error context and user won't have a clue as to what error context is
I don't think so. Consider an example of the conversion error message,
which is from the regression test:
SELECT ft1.c1, ft2.c2, ft1 FROM ft1, ft2 WHERE ft1.c1 = ft2.c1 AND
ft1.c1 = 1;
ERROR: invalid input syntax for integer: "foo"
CONTEXT: whole-row reference to foreign table "ft1"
As shown in the example, the error message is displayed under a remote
query for execution. So, ISTM it's reasonable to print something like
"expression at position %d in select list" in the context if an
expression in a PHV.
> But I don't have any good suggestions right now. May be we should
> print the whole expression? But that can be very long, I don't know.
ISTM that it's a bit too expensive to print the whole expression in the
> This patch tries to do two things at a time 1. support join pushdown for
> FULL join when the joining relations have remote conditions 2. better
> support for fetching placeholder vars, whole row references and some
> system columns. To make reviews easy, I think we should split the patch
> into two 1. supporting subqueries to be deparsed with support for one of
> the above (I will suggest FULL join support) 2. support for the other.
OK, will try.
|Next Message||Etsuro Fujita||2016-09-08 10:55:03||Re: Push down more UPDATEs/DELETEs in postgres_fdw|
|Previous Message||Simon Riggs||2016-09-08 10:18:16||Re: Bug in two-phase transaction recovery|