Observations in Parallel Append

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Observations in Parallel Append
Date: 2017-12-22 14:18:08
Message-ID: CAA4eK1+qcbeai3coPpRW=GFCzFeLUsuY4T-AKHqMjxpEGZBPQg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Few observations in Parallel Append commit (ab727167)

1.
+++ b/src/include/nodes/execnodes.h
@@ -21,6 +21,7 @@
#include "lib/pairingheap.h"
#include "nodes/params.h"
#include "nodes/plannodes.h"
+#include "storage/spin.h"
..

There doesn't seem to be any need for including spin.h. I think some
prior version of the patch might need it. Patch attached to remove
it.

2.
+ * choose_next_subplan_for_worker
..
+ * We start from the first plan and advance through the list;
+ * when we get back to the end, we loop back to the first
+ * nonpartial plan.
..
+choose_next_subplan_for_worker(AppendState *node)
{
..
+ if (pstate->pa_next_plan < node->as_nplans - 1)
+ {
+ /* Advance to next plan. */
+ pstate->pa_next_plan++;
+ }
+ else if (append->first_partial_plan < node->as_nplans)
+ {
+ /* Loop back to first partial plan. */
+ pstate->pa_next_plan = append->first_partial_plan;
+ }
..

The code and comment don't seem to match. The comments indicate that
after reaching the end of the list, we loop back to first nonpartial
plan whereas code indicates that we loop back to first partial plan.
I think one of those needs to be changed unless I am missing something
obvious.

3.
+cost_append(AppendPath *apath)
{
..
+ /*
+ * Apply parallel divisor to non-partial subpaths. Also add the
+ * cost of partial paths to the total cost, but ignore non-partial
+ * paths for now.
+ */
+ if (i < apath->first_partial_path)
+ apath->path.rows += subpath->rows / parallel_divisor;
+ else
+ {
+ apath->path.rows += subpath->rows;
+ apath->path.total_cost += subpath->total_cost;
+ }
..
}

I think it is better to use clamp_row_est for rows for the case where
we use parallel_divisor so that the value of rows is always sane.
Also, don't we need to use parallel_divisor for partial paths instead
of non-partial paths as those will be actually distributed among
workers?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
spurious_inclusion_v1.patch application/octet-stream 405 bytes

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Oliver Ford 2017-12-22 14:28:52 Re: Add RANGE with values and exclusions clauses to the Window Functions
Previous Message Pavel Stehule 2017-12-22 14:02:23 Re: Using ProcSignal to get memory context stats from a running backend