From: | Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Question about commit 11cf92f6e2e13c0a6e3f98be3e629e6bd90b74d5 |
Date: | 2019-03-05 08:00:43 |
Message-ID: | 5C7E2CAB.2040001@lab.ntt.co.jp |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
(2019/03/02 1:10), Robert Haas wrote:
> On Fri, Mar 1, 2019 at 5:47 AM Etsuro Fujita
> <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>> Robert, I CCed you because you are the author of that commit. Before
>> that commit ("Rewrite the code that applies scan/join targets to
>> paths."), apply_scanjoin_target_to_paths() had a boolean parameter named
>> modify_in_place, and used apply_projection_to_path(), not
>> create_projection_path(), to adjust scan/join paths when modify_in_place
>> was true, which allowed us to save cycles at plan creation time by
>> avoiding creating projection paths, which I think would be a good thing,
>> but that commit removed that. Why?
>
> One of the goals of the commit was to properly account for the cost of
> computing the target list. Before parallel query and partition-wise
> join, it didn't really matter what the cost of computing the target
> list was, because every path was going to have to do the same work, so
> it was just a constant factor getting added to every path. However,
> parallel query and partition-wise join mean that some paths can
> compute the final target list more cheaply than others, and that turns
> out to be important for things like PostGIS. One of the complaints
> that provoked that change was that PostGIS was picking non-parallel
> plans even when a parallel plan was substantially superior, because it
> wasn't accounting for the fact that in the parallel plan, the cost of
> computing the target-list could be shared across all the workers
> rather than paid entirely by the leader.
>
> In order to accomplish this goal of properly accounting for the cost
> of computing the target list, we need to create a new path, not just
> jam the target list into an already-costed path. Note that we did
> some performance optimization around the same time to minimize the
> performance hit here (see d7c19e62a8e0a634eb6b29f8f1111d944e57081f,
> and I think there may have been something else as well although I
> can't find it right now).
apply_projection_to_path() not only jams the given tlist into the
existing path but updates its tlist eval costs appropriately except for
the cases of Gather and GatherMerge:
/*
* If the path happens to be a Gather or GatherMerge path, we'd like to
* arrange for the subpath to return the required target list so that
* workers can help project. But if there is something that is not
* parallel-safe in the target expressions, then we can't.
*/
if ((IsA(path, GatherPath) ||IsA(path, GatherMergePath)) &&
is_parallel_safe(root, (Node *) target->exprs))
{
/*
* We always use create_projection_path here, even if the
subpath is
* projection-capable, so as to avoid modifying the subpath in
place.
* It seems unlikely at present that there could be any other
* references to the subpath, but better safe than sorry.
*
--> * Note that we don't change the parallel path's cost estimates; it
--> * might be appropriate to do so, to reflect the fact that the
bulk of
--> * the target evaluation will happen in workers.
*/
if (IsA(path, GatherPath))
{
GatherPath *gpath = (GatherPath *) path;
gpath->subpath = (Path *)
create_projection_path(root,
gpath->subpath->parent,
gpath->subpath,
target);
}
else
{
GatherMergePath *gmpath = (GatherMergePath *) path;
gmpath->subpath = (Path *)
create_projection_path(root,
gmpath->subpath->parent,
gmpath->subpath,
target);
}
}
>> The real reason for this question is: I noticed that projection paths
>> put on foreign paths will make it hard for FDWs to detect whether there
>> is an already-well-enough-sorted remote path in the pathlist for the
>> final scan/join relation as an input relation to GetForeignUpperPaths()
>> for the UPPERREL_ORDERED step (the IsA(path, ForeignPath) test would not
>> work well enough to detect remote paths!), so I'm wondering whether we
>> could revive that parameter like the attached, to avoid the overhead at
>> plan creation time and to make the FDW work easy. Maybe I'm missing
>> something, though.
>
> I think this would be a bad idea, for the reasons explained above. I
> also think that it's probably the wrong direction on principle. I
> think the way we account for target lists is still pretty crude and
> needs to be thought of as more of a real planning step and less as
> something that we can just ignore when it's inconvenient for some
> reason.
I'm not sure I 100% agree with you, but I also think we need to give
more thought to the tlist-eval-cost adjustment.
> I think the FDW just needs to look through the projection
> path and see what's underneath, but also take the projection path's
> target list into account when decide whether more can be pushed down.
OK, I'll go with that.
Thanks for the explanation!
Best regards,
Etsuro Fujita
From | Date | Subject | |
---|---|---|---|
Next Message | David Steele | 2019-03-05 08:10:36 | Re: Re: [PATCH] pgbench tap tests fail if the path contains a perl special character |
Previous Message | David Steele | 2019-03-05 07:57:25 | Re: Re: \describe* |