From: | Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> |
---|---|
To: | Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Push down more full joins in postgres_fdw |
Date: | 2016-10-25 12:21:42 |
Message-ID: | 5aa9ed42-a00f-be85-1c8d-4b10a0b12c7e@lab.ntt.co.jp |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 2016/10/25 18:58, Ashutosh Bapat wrote:
You wrote:
>>> 13. The comment below is missing the main purpose i.e. creating a a unique
>>> alias, in case the relation gets converted into a subquery. Lowest or
>>> highest
>>> relid will create a unique alias at given level of join and that would be
>>> more
>>> future proof. If we decide to consider paths for every join order,
>>> following
>>> comment will no more be true.
>>> + /*
>>> + * Set the relation index. This is defined as the position of this
>>> + * joinrel in the join_rel_list list plus the length of the rtable
>>> list.
>>> + * Note that since this joinrel is at the end of the list when we are
>>> + * called, we can get the position by list_length.
>>> + */
>>> + fpinfo->relation_index =
>>> + list_length(root->parse->rtable) +
>>> list_length(root->join_rel_list);
I wrote:
>> I don't agree on that point. As I said before, the approach using the
>> lowest/highest relid would make a remote query having nested subqueries
>> difficult to read. And even if we decided to consider paths for every join
>> order, we could define the relation_index safely, by modifying
>> postgresGetForeignJoinPaths so that it (1) sets the relation_index the
>> proposed way at the first time it is called and (2) avoids setting the
>> relation_index after the first call, by checking to see
>> fpinfo->relation_index > 0.
> OK. Although, the alias which contains a number, which apparently
> doesn't show any relation with the relation (no pun intended :)) being
> deparsed, won't make much sense in the deparsed query. But I am fine
> leaving this to the committer's judgement.
Fine with me.
> May be you want to add an
> assert here making sure that relation_index is set only once.
> Something like Assert(fpinfo->relation_index == 0) just before setting
> it.
Will do.
>>> 15. While deparsing every Var, we are descending the RelOptInfo hierarchy.
>> Right. I think that not only avoids creating redundant data to find the
>> alias of a given Var node but also would be able to find it in optimal cost.
>>> Instead of that, we should probably gather all the alias information once
>>> and
>>> store it in context as an array indexed by relid. While deparsing a Var we
>>> can
>>> get the targetlist and alias for a given relation by using var->varno as
>>> index.
>>> And then search for given Var node in the targetlist to deparse the column
>>> name
>>> by its position in the targetlist. For the relations that are not
>>> converted
>>> into subqueries, this array will not hold any information and the Var will
>>> be
>>> deparsed as it's being done today.
>> Sorry, I don't understand this part. How does that work when creating a
>> remote query having nested subqueries? Is that able to search for a given
>> Var node efficiently?
> For every relation that is deparsed as a subquery, we will create a
> separate context. Each such context will have an array described
> above. This array will contain the targetlist and aliases for all the
> relations, covered by that relation, that are required to be deparsed
> as subqueries. These arrays bubble up to topmost relation that is not
> required to be deparsed as subquery. When a relation is required to be
> deparsed as a subquery, its immediate upper relation sets the alias
> and targetlist chosen for that relation at all the indexes
> corresponding to all the base relations covered by that relation.
Thanks for the explanation!
> For
> example, let's assume that a relation (1, 2, 3) is required to be
> deparsed as subquery for an immediate upper relation (1, 2, 3, 4, 5)
> (thus the other joining relation being (4,5)). While deparsing for
> relation (1,2,3,4,5), the context will contain a 5 element array, with
> positions 1, 2, 3 filled by same targetlist and alias whereas
> positions 4 and 5 will not be filled as those relations are not
> deparsed as subqueries.
Sorry, I don't understand this. In that case, the immediate upper
relation (1, 2, 3, 4, 5) would need to fill the targetlist and aliases
for *the join relation (1, 2, 3) somewhere*, not the targetlist and
aliases for each of the component relations 1, 2, and 3, because the
join relation is deparsed as a subquery. Maybe I'm missing something,
though.
> Let's assume in relation (1, 2, 3), (1, 3) in
> turn requires subquery but (2) does not. Thus the context created
> while deparsing (1, 2, 3) will have a 3 element array with positions 1
> and 3 containing the same targetlist and alias, where as position 2
> will be empty.
> When deparsing a Var node with varno = N and varattno =
> m, if the nth position in the array in the context is empty, that Var
> node will be deparsed as rN.<column name>.
What happens when deparsing eg, a Var with varno = 2 at the topmost
relation (1, 2, 3, 4, 5)? The second position of the array is empty,
but the join relation (1, 2, 3) is deparsed as a subquery, so the Var
should be deparsed as an alias of an output column of the subquery at
the topmost relation, I think.
> But if that position is has
> alias sZ, then we search for that Var node in the targetlist and if
> it's found at kth position in the targetlist, we will deparse it as
> sZ.ck. The search in the targetlist can be performed using
> tlist_member, and then fetching the position by TargetEntry::resno.
> This does not require any recursion and thus saves stack space and
> some CPU cycles required for recursion.
Is that true?
> some CPU cycles required for recursion. I guess, the arrays need to be
> computed only once for any relation when the query for that relation
> is deparsed the first time.
Does this algorithm extend to the case where we consider paths for every
join order?
> Thanks for considering other suggestions.
Your suggestions are really helpful to improve the patch. Thanks!
Best regards,
Etsuro Fujita
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Eisentraut | 2016-10-25 12:40:43 | Re: macaddr 64 bit (EUI-64) datatype support |
Previous Message | Amit Kapila | 2016-10-25 11:32:54 | Re: Declarative partitioning - another take |