Re: Duplicate Workers entries in some EXPLAIN plans

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Maciek Sakrejda <m(dot)sakrejda(at)gmail(dot)com>, Georgios Kokolatos <gkokolatos(at)pm(dot)me>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Duplicate Workers entries in some EXPLAIN plans
Date: 2020-01-26 00:02:58
Message-ID: 20200126000258.sow7e5yaavokhc2v@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2020-01-25 14:23:50 -0500, Tom Lane wrote:
> A side effect of this change is that per-worker JIT info is now
> printed one plan level further down: before it was printed on
> the Gather node, but now it's attached to the Gather's child,
> because that's where we print other per-worker data. I don't
> see anything particularly wrong with that, but it's another
> change from the behavior today.

Yea, I don't see any need to be bothered by that.

> It's still really unclear to me how we could exercise any of
> this behavior meaningfully in a regression test. I thought
> for a little bit about using the TAP infrastructure instead
> of a traditional-style test, but it seems like that doesn't
> buy anything except for a bias towards ignoring details instead
> of overspecifying them. Which is not much of an improvement.

Hm. I'd like to avoid needing TAP for this kind of thing, it'll just
make it much more expensive in just about all ways.

Testing JIT explain is "easy" enough I think, I've posted a patch in
another thread, which just skips over the region of the test if JIT is
not available. See e.g. 0008 in
https://www.postgresql.org/message-id/20191029000229.fkjmuld3g7f2jq7i%40alap3.anarazel.de
(that's a thread I'd love your input in btw)

It's harder for parallel query though. If parallel query were able to
reuse workers, we could "just" check at the beginning of the test if we
are able to get the workers we need, and then skip the rest of the tests
if not. But as things stand that doesn't guarantee anything.

I wonder if we could introduce a debug GUC that makes parallel worker
acquisition just retry in a loop, for a time determined by the GUC. That
obviously would be a bad idea to do in a production setup, but it could
be good enough for regression tests? There are some deadlock dangers,
but I'm not sure they really matter for the tests.

> + /* prepare per-worker general execution details */
> + if (es->workers_state && es->verbose)
> + {
> + WorkerInstrumentation *w = planstate->worker_instrument;
> +
> + for (int n = 0; n < w->num_workers; n++)
> + {
> + Instrumentation *instrument = &w->instrument[n];
> + double nloops = instrument->nloops;
> + double startup_ms;
> + double total_ms;
> + double rows;
> +
> + if (nloops <= 0)
> + continue;
> + startup_ms = 1000.0 * instrument->startup / nloops;
> + total_ms = 1000.0 * instrument->total / nloops;
> + rows = instrument->ntuples / nloops;
> +
> + ExplainOpenWorker(n, es);
> +
> + if (es->format == EXPLAIN_FORMAT_TEXT)
> + {
> + ExplainIndentText(es);
> + if (es->timing)
> + appendStringInfo(es->str,
> + "actual time=%.3f..%.3f rows=%.0f loops=%.0f\n",
> + startup_ms, total_ms, rows, nloops);
> + else
> + appendStringInfo(es->str,
> + "actual rows=%.0f loops=%.0f\n",
> + rows, nloops);
> + }
> + else
> + {
> + if (es->timing)
> + {
> + ExplainPropertyFloat("Actual Startup Time", "ms",
> + startup_ms, 3, es);
> + ExplainPropertyFloat("Actual Total Time", "ms",
> + total_ms, 3, es);
> + }
> + ExplainPropertyFloat("Actual Rows", NULL, rows, 0, es);
> + ExplainPropertyFloat("Actual Loops", NULL, nloops, 0, es);
> + }
> +
> + ExplainCloseWorker(n, es);
> + }
> + }

I'd personally move this into a separate function, given the patches
moves the code around already. ExplainNode() is already hard enough to
navigate...

It probably also makes sense to move everything but the nloops <= 0,
ExplainOpenWorker/ExplainCloseWorker into its own function. As far as I
can tell it now should be identical between the non-parallel case?

> +/*
> + * Begin or resume output into the set-aside group for worker N.
> + */
> +static void

Would it make sense to make these functions non-static? It seems
plausible that code for a custom node or such would want to add
its own information?

> +ExplainOpenWorker(int n, ExplainState *es)
> +{
> + ExplainWorkersState *wstate = es->workers_state;
> +
> + Assert(wstate);
> + Assert(n >= 0 && n < wstate->num_workers);
> +
> + /* Save prior output buffer pointer */
> + wstate->prev_str = es->str;
> +
> + if (!wstate->worker_inited[n])
> + {
> + /* First time through, so create the buffer for this worker */
> + initStringInfo(&wstate->worker_str[n]);
> + es->str = &wstate->worker_str[n];
> +
> + /*
> + * Push suitable initial formatting state for this worker's field
> + * group. We allow one extra logical nesting level, since this group
> + * will eventually be wrapped in an outer "Workers" group.
> + */
> + ExplainOpenSetAsideGroup("Worker", NULL, true, 2, es);
> +
> + /*
> + * In non-TEXT formats we always emit a "Worker Number" field, even if
> + * there's no other data for this worker.
> + */
> + if (es->format != EXPLAIN_FORMAT_TEXT)
> + ExplainPropertyInteger("Worker Number", NULL, n, es);
> +
> + wstate->worker_inited[n] = true;
> + }
> + else
> + {
> + /* Resuming output for a worker we've already emitted some data for */
> + es->str = &wstate->worker_str[n];
> +
> + /* Restore formatting state saved by last ExplainCloseWorker() */
> + ExplainRestoreGroup(es, 2, &wstate->worker_state_save[n]);
> + }
> +
> + /*
> + * In TEXT format, prefix the first output line for this worker with
> + * "Worker N:". Then, any additional lines should be indented one more
> + * stop than the "Worker N" line is.
> + */
> + if (es->format == EXPLAIN_FORMAT_TEXT)
> + {
> + if (es->str->len == 0)
> + {
> + ExplainIndentText(es);
> + appendStringInfo(es->str, "Worker %d: ", n);
> + }
> +
> + es->indent++;
> + }
> +}

I don't quite get the Worker %d bit. Why are we not outputting that in
the !worker_inited block?

> +/*
> + * Print per-worker info for current node, then free the ExplainWorkersState.
> + */
> +static void
> +ExplainFlushWorkersState(ExplainState *es)
> +{
> + ExplainWorkersState *wstate = es->workers_state;
> +
> + ExplainOpenGroup("Workers", "Workers", false, es);
> + for (int i = 0; i < wstate->num_workers; i++)
> + {
> + if (wstate->worker_inited[i])
> + {
> + /* This must match previous ExplainOpenSetAsideGroup call */
> + ExplainOpenGroup("Worker", NULL, true, es);
> + appendStringInfoString(es->str, wstate->worker_str[i].data);

Probably never matters, but given we do have the string length already,
we could use appendBinaryStringInfo().

> + ExplainCloseGroup("Worker", NULL, true, es);

Not related to this patch: I never got why we maintain a grouping stack
for some things, but don't do it for the group properties
themselves.

> /*
> + * Open a group of related objects, without emitting actual data.
> + *
> + * Prepare the formatting state as though we were beginning a group with
> + * the identified properties, but don't actually emit anything. Output
> + * subsequent to this call can be redirected into a separate output buffer,
> + * and then eventually appended to the main output buffer after doing a
> + * regular ExplainOpenGroup call (with the same parameters).
> + *
> + * The extra "depth" parameter is the new group's depth compared to current.
> + * It could be more than one, in case the eventual output will be enclosed
> + * in additional nesting group levels. We assume we don't need to track
> + * formatting state for those levels while preparing this group's output.
> + *
> + * There is no ExplainCloseSetAsideGroup --- in current usage, we always
> + * pop this state with ExplainSaveGroup.
> + */
> +static void
> +ExplainOpenSetAsideGroup(const char *objtype, const char *labelname,
> + bool labeled, int depth, ExplainState *es)
> +{
> + switch (es->format)
> + {
> + case EXPLAIN_FORMAT_TEXT:
> + /* nothing to do */
> + break;
> +
> + case EXPLAIN_FORMAT_XML:
> + es->indent += depth;
> + break;
> +
> + case EXPLAIN_FORMAT_JSON:
> + es->grouping_stack = lcons_int(0, es->grouping_stack);
> + es->indent += depth;
> + break;
> +
> + case EXPLAIN_FORMAT_YAML:
> + if (labelname)
> + es->grouping_stack = lcons_int(1, es->grouping_stack);
> + else
> + es->grouping_stack = lcons_int(0, es->grouping_stack);
> + es->indent += depth;
> + break;
> + }
> +}

Hm. It might be worthwhile to rename ExplainOpenSetAsideGroup and use it
from ExplainOpenGroup()? Seems we could just call it after
ExplainOpenGroup()'s switch, and remove all the indent/grouping_stack
related code from ExplainOpenGroup().

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2020-01-26 00:30:33 Re: Duplicate Workers entries in some EXPLAIN plans
Previous Message Tom Lane 2020-01-25 23:20:16 Re: Duplicate Workers entries in some EXPLAIN plans