Re: Push down more full joins in postgres_fdw

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Push down more full joins in postgres_fdw
Date: 2016-09-06 13:07:12
Message-ID: CAFjFpRdcC7Ykb1SkARBYikx9ubKniBiAgHqMD9e47TxzY2EYFw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Sep 2, 2016 at 3:55 PM, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
wrote:

> Hi Ashutosh,
>
> On 2016/08/22 15:49, Ashutosh Bapat wrote:
>
>> 1. deparsePlaceHolderVar looks odd - each of the deparse* function is
>> named as deparse + <name of the parser node the string would parse
>> into>. PlaceHolderVar is not a parser node, so no string is going to be
>> parsed as a PlaceHolderVar. May be you want to name it as
>> deparseExerFromPlaceholderVar or something like that.
>>
>
> The name "deparseExerFromPlaceholderVar" seems long to me. How about
> "deparsePlaceHolderExpr"?
>

There isn't any node with name PlaceHolderExpr.

> 2. You will need to check phlevelsup member while assessing whether a
>> PHV is safe to push down.
>>
>
> Good catch! In addition to that, I added another check that the eval_at
> set for the PHV should be included in the relids set for the foreign
> relation. I think that would make the shippability check more robust.
>

Thanks.

>
> 3. I think registerAlias stuff should happen really at the time of
>> creating paths and should be stored in fpinfo. Without that it will be
>> computed every time we deparse the query. This means every time we try
>> to EXPLAIN the query at various levels in the join tree and once for the
>> query to be sent to the foreign server.
>>
>
> Hmm. I think the overhead in calculating aliases would be negligible
> compared to the overhead in explaining each remote query for costing or
> sending the remote query for execution. So, I created aliases in the same
> way as remote params created and stored into params_list in
> deparse_expr_cxt. I'm not sure it's worth complicating the code.
>

We defer remote parameter creation till deparse time since the the
parameter numbers are dependent upon the sequence in which we deparse the
query. Creating them at the time of path creation and storing them in
fpinfo is not possible because a. those present in the joining relations
will conflict with each other and need some kind of serialization at the
time of deparsing b. those defer for differently parameterized paths or
paths with different pathkeys. We don't defer it because it's very light on
performance.

That's not true with the alias information. As long as we detect which
relations need subqueries, their RTIs are enough to create unique aliases
e.g. if a relation involves RTIs 1, 2, 3 corresponding subquery can have
alias r123 without conflicting with any other alias.

However minimum overhead it might have, we will deparse the query every
time we create a path involving that kind of relation i.e. for different
pathkeys, different parameterization and different joins in which that
relation participates in. This number can be significant. We want to avoid
this overhead if we can.

>
> 5. The blocks related to inner and outer relations in
>> deparseFromExprForRel() look same. We should probably separate that code
>> out into a function and call it at two places.
>>
>
> Done.
>

Thanks. I see you have created function deparseOperandRelation() for the
same. I guess, this should be renamed as deparseRangeTblRef() since it will
create RangeTblRef node when parsed back.

>
> 6.
>> ! if (is_placeholder)
>> ! errcontext("placeholder expression at position %d in select
>> list",
>> ! errpos->cur_attno);
>> A user wouldn't know what a placeholder expression is, there is no such
>> term in the documentation. We have to device a better way to provide an
>> error context for an expression in general.
>>
>
> Though I proposed that, I don't think that it's that important to let
> users know that the expression is from a PHV. How about just saying
> "expression", not "placeholder expression", so that we have the message
> "expression at position %d in select list" in the context?
>
> Hmm, that's better than placeholder expression, but not as explanatory as
it should be since we won't be printing the "select list" in the error
context and user won't have a clue as to what error context is about. But I
don't have any good suggestions right now. May be we should print the whole
expression? But that can be very long, I don't know.

This patch tries to do two things at a time 1. support join pushdown for
FULL join when the joining relations have remote conditions 2. better
support for fetching placeholder vars, whole row references and some system
columns. To make reviews easy, I think we should split the patch into two
1. supporting subqueries to be deparsed with support for one of the above
(I will suggest FULL join support) 2. support for the other.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2016-09-06 13:07:23 Re: Declarative partitioning - another take
Previous Message Amit Langote 2016-09-06 13:04:09 Re: Declarative partitioning - another take