Re: [HACKERS] why not parallel seq scan for slow functions

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Marina Polyakova <m(dot)polyakova(at)postgrespro(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Subject: Re: [HACKERS] why not parallel seq scan for slow functions
Date: 2018-03-13 18:31:40
Message-ID: CA+TgmoakT5gmahbPWGqrR2nAdFOMAOnOXYoWHRdVfGWs34t6_A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Feb 14, 2018 at 5:37 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> Your concern is valid, but isn't the same problem exists in another
> approach as well, because in that also we can call
> adjust_paths_for_srfs after generating gather path which means that we
> might lose some opportunity to reduce the relative cost of parallel
> paths due to tlists having SRFs. Also, a similar problem can happen
> in create_order_paths for the cases as described in the example
> above.

You're right. I think cleaning all of this up for v11 is too much to
consider, but please tell me your opinion of the attached patch set.
Here, instead of the ripping the problematic logic out of
apply_projection_to_path, what I've done is just removed several of
the callers to apply_projection_to_path. I think that the work of
removing other callers to that function could be postponed to a future
release, but we'd still get some benefit now, and this shows the
direction I have in mind. I'm going to explain what the patches do
one by one, but out of order, because I backed into the need for the
earlier patches as a result of troubleshooting the later ones in the
series. Note that the patches need to be applied in order even though
I'm explaining them out of order.

0003 introduces a new upper relation to represent the result of
applying the scan/join target to the topmost scan/join relation. I'll
explain further down why this seems to be needed. Since each path
must belong to only one relation, this involves using
create_projection_path() for the non-partial pathlist as we already do
for the partial pathlist, rather than apply_projection_to_path().
This is probably marginally more expensive but I'm hoping it doesn't
matter. (However, I have not tested.) Each non-partial path in the
topmost scan/join rel produces a corresponding path in the new upper
rel. The same is done for partial paths if the scan/join target is
parallel-safe; otherwise we can't.

0004 causes the main part of the planner to skip calling
generate_gather_paths() from the topmost scan/join rel. This logic is
mostly stolen from your patch. If the scan/join target is NOT
parallel-safe, we perform generate_gather_paths() on the topmost
scan/join rel. If the scan/join target IS parallel-safe, we do it on
the post-projection rel introduced by 0003 instead. This is the patch
that actually fixes Jeff's original complaint.

0005 goes a bit further and removes a bunch of logic from
create_ordered_paths(). The removed logic tries to satisfy the
query's ordering requirement via cheapest_partial_path + Sort + Gather
Merge. Instead, it adds an additional path to the new upper rel added
in 0003. This again avoids a call to apply_projection_to_path() which
could cause projection to be pushed down after costing has already
been fixed. Therefore, it gains the same advantages as 0004 for
queries that would this sort of plan. Currently, this loses the
ability to set limit_tuples for the Sort path; that doesn't look too
hard to fix but I haven't done it. If we decide to proceed with this
overall approach I'll see about getting it sorted out.

Unfortunately, when I initially tried this approach, a number of
things broke due to the fact that create_projection_path() was not
exactly equivalent to apply_projection_to_path(). This initially
surprised me, because create_projection_plan() contains logic to
eliminate the Result node that is very similar to the logic in
apply_projection_to_path(). If apply_projection_path() sees that the
subordinate node is projection-capable, it applies the revised target
list to the child; if not, it adds a Result node.
create_projection_plan() does the same thing. However,
create_projection_plan() can lose the physical tlist optimization for
the subordinate node; it forces an exact projection even if the parent
node doesn't require this. 0001 fixes this, so that we get the same
plans with create_projection_path() that we would with
apply_projection_to_path(). I think this patch is a good idea
regardless of what we decide to do about the rest of this, because it
can potentially avoid losing the physical-tlist optimization in any
place where create_projection_path() is used.

It turns out that 0001 doesn't manage to preserve the physical-tlist
optimization when the projection path is attached to an upper
relation. 0002 fixes this.

If we decide to go forward with this approach, it may makes sense to
merge some of these when actually committing, but I found it useful to
separate them for development and testing purposes, and for clarity
about what was getting changed at each stage. Please let me know your
thoughts.

Thanks,

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
0005-Remove-explicit-path-construction-logic-in-create_or.patch application/octet-stream 4.3 KB
0004-Postpone-generate_gather_paths-for-topmost-scan-join.patch application/octet-stream 7.1 KB
0003-Add-new-upper-rel-to-represent-projecting-toplevel-s.patch application/octet-stream 6.3 KB
0002-Adjust-use_physical_tlist-to-work-on-upper-rels.patch application/octet-stream 1.6 KB
0001-Teach-create_projection_plan-to-omit-projection-wher.patch application/octet-stream 3.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2018-03-13 18:36:44 Re: JIT compiling with LLVM v11
Previous Message Andres Freund 2018-03-13 18:30:56 Re: JIT compiling with LLVM v11