Re: parallel.c is not marked as test covered

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: parallel.c is not marked as test covered
Date: 2016-05-11 17:38:33
Message-ID: CA+Tgmoa+NPuSdVpMTP68H_tSWvCrKCJvwV_Cb+tuGQaWzeWB-Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, May 11, 2016 at 12:34 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Hmm, that is strange. I would have expected that to stuff a Gather on
>> top of the Aggregate. I wonder why it's not doing that.
>
> The reason is that create_plain_partial_paths() contains a hard-wired
> decision not to generate any partial paths for relations smaller than
> 1000 blocks, which means that no partial path will ever be generated
> for *any* relation in the standard regression tests, force_parallel_mode
> or no.

Well that's an interesting theory, except that you've completely
missed the point of force_parallel_mode. force_parallel_mode pushes a
special Gather node on top of any plan that is not already parallel in
some way but which is parallel-safe. That special Gather node runs
only in the worker, not the leader, and always uses just one worker.
The point is to test that queries run in the worker in the same way
that they do in the leader. This has already found lots of bugs, so
it's clearly useful, despite all the confusion it's created.

> I would just go fix this, along the lines of
>
> *************** create_plain_partial_paths(PlannerInfo *
> *** 702,708 ****
> * with all of its inheritance siblings it may well pay off.
> */
> if (rel->pages < parallel_threshold &&
> ! rel->reloptkind == RELOPT_BASEREL)
> return;
>
> /*
> --- 703,710 ----
> * with all of its inheritance siblings it may well pay off.
> */
> if (rel->pages < parallel_threshold &&
> ! rel->reloptkind == RELOPT_BASEREL &&
> ! force_parallel_mode == FORCE_PARALLEL_OFF)
> return;
>
> /*
>
> except that doing so and running the regression tests with
> force_parallel_mode = on results in core dumps.

Nonetheless, that is a bug. (I only get one crash - do you get more?)

> Some debugging seems
> indicated ... and we should at this point assume that there has been no
> useful testing of parallel query in the buildfarm, not even on Noah's
> machines.

Yes and no. The force_parallel_mode stuff is primarily there to tell
us about things like "you marked a function as parallel-safe, but it
actually blows up when run in a worker". It won't catch a case where
a function is marked parallel-safe but silently does the wrong thing
instead of failing, but it will catch quite a few mistakes of that
kind, and I think that's good.

What it won't do, though, is actually run "real" parallel queries -
that is, multiple workers doing a Parallel Seq Scan with some other
stuff pushed on top, and then a Gather. And we should have test
coverage for that, too, so that we're testing real concurrent
behavior, and not just our ability to run plans in a worker.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Christopher Browne 2016-05-11 17:45:22 Re: Academic help for Postgres
Previous Message Kevin Grittner 2016-05-11 17:34:58 Re: Does Type Have = Operator?