Re: parallel.c is not marked as test covered

From: Noah Misch <noah(at)leadboat(dot)com>
To: Clément Prévost <prevostclement(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: parallel.c is not marked as test covered
Date: 2016-06-07 05:27:16
Message-ID: 20160607052716.GA766951@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks for this patch. I have reviewed it:

On Sun, May 15, 2016 at 12:53:13PM +0000, Clément Prévost wrote:
> After some experiments, I found out that, for my setup (9b7bfc3a88ef7b), a
> parallel seq scan is used given both parallel_setup_cost
> and parallel_tuple_cost are set to 0 and given that the table is at least 3
> times as large as the biggest test table tenk1.

That worked because it pushed the table over the create_plain_partial_paths()
1000-block (8 MiB) threshold; see
http://www.postgresql.org/message-id/26842.1462941248@sss.pgh.pa.us

While the way you tested this is not wrong, I recommend instead querying an
inheritance parent if that achieves no less test coverage. "SELECT count(*)
FROM a_star" gets a parallel plan under your cost parameters. That avoids
building and discarding a relatively-large table.

> The attached patch is a regression test using this method that is
> reasonably small and fast to run. I also hid the workers count from the
> explain output when costs are disabled as suggested by Tom Lane and Robert
> Haas on this same thread (
> http://www.postgresql.org/message-id/CA+TgmobBQS4ss3+CwoZOKgbsBqKfRndwc=hLiALAep5aXQCTDA@mail.gmail.com
> ).

It seems wrong to me for COSTS OFF to disable output that is not a cost
estimate. file_fdw did that, but I don't want to proliferate that loose
interpretation of COSTS OFF.

Unlike the example in the linked-to message, this test won't experience
variable worker count. For the query you used and for the a_star query, low
input size makes the planner request exactly one worker. If later planner
enhancements give this query a configuration-specific worker count, the test
could capture EXPLAIN output with PL/pgSQL and compare it to a regex.

> Testing under these conditions does not test the planner job at all but at
> least some parallel code can be run on the build farm and the test suite
> gets 2643 more lines and 188 more function covered.

Nice.

> --- a/src/test/regress/parallel_schedule
> +++ b/src/test/regress/parallel_schedule
> @@ -79,7 +79,7 @@ ignore: random
> # ----------
> # Another group of parallel tests
> # ----------
> -test: select_into select_distinct select_distinct_on select_implicit select_having subselect union case join aggregates transactions random portals arrays btree_index hash_index update namespace prepared_xacts delete
> +test: select_into select_distinct select_distinct_on select_implicit select_having subselect union case join aggregates transactions random portals arrays btree_index hash_index update namespace prepared_xacts delete select_parallel

Add this to serial_schedule, too.

> --- /dev/null
> +++ b/src/test/regress/sql/select_parallel.sql
> @@ -0,0 +1,22 @@
> +--
> +-- PARALLEL
> +--
> +
> +-- setup parallel test
> +create unlogged table tenk_parallel as table tenk1 with no data;
> +set parallel_setup_cost=0;
> +set parallel_tuple_cost=0;
> +
> +-- create a table with enough data to trigger parallel behavior
> +insert into tenk_parallel
> + select tenk1.* from tenk1, generate_series(0,2);
> +
> +explain (costs off)
> + select count(*) from tenk_parallel;
> +select count(*) from tenk_parallel;

As of today, "make installcheck" passes with "default_transaction_isolation =
serializable" in postgresql.conf. Let's preserve that property. You could
wrap the parallel queries in "begin isolation level repeatable read;"
... "commit;", or you could SET default_transaction_isolation itself.

> +
> +-- cleanup
> +drop table tenk_parallel;
> +set parallel_setup_cost to default;
> +set parallel_tuple_cost to default;
> +

Remove the trailing blank line; it triggers a "git diff --check" diagnostic.

[Official open item status update: I expect to be able to review the next
patch version within a day or two of it becoming available. I will update
this thread by 2016-06-14 09:00 UTC if not earlier.]

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2016-06-07 05:35:28 Re: [BUGS] BUG #14155: bloom index error with unlogged table
Previous Message Haribabu Kommi 2016-06-07 05:19:06 Re: IPv6 link-local addresses and init data type