Re: postgres_fdw vs. force_parallel_mode on ppc

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Simon Riggs <simon(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: postgres_fdw vs. force_parallel_mode on ppc
Date: 2016-02-16 00:31:40
Message-ID: CA+TgmoZ3EJCWB_7Yik=JEJiJO__GAyxYz4CjXHoH6XRkXQP5=g@mail.gmail.com
Lists: pgsql-hackers

On Mon, Feb 15, 2016 at 5:52 PM, Noah Misch <noah(at)leadboat(dot)com> wrote:
> On Mon, Feb 08, 2016 at 02:49:27PM -0500, Robert Haas wrote:
>> Well, what I've done is push into the buildfarm code that will allow
>> us to do *the most exhaustive* testing that I know how to do in an
>> automated fashion. Which is to create a file that says this:
>>
>> force_parallel_mode=regress
>> max_parallel_degree=2
>>
>> And then run this: make check-world TEMP_CONFIG=/path/to/aforementioned/file
>>
>> Now, that is not going to find bugs in the deadlock.c portion of the
>> group locking patch, but it's been wildly successful in finding bugs
>> in other parts of the parallelism code, and there might well be a few
>> more that we haven't found yet, which is why I'm hoping that we'll get
>> this procedure running regularly either on all buildfarm machines, or
>> on some subset of them, or on new animals that just do this.
>
> I configured a copy of animal "mandrill" that way and launched a test run.
> The postgres_fdw suite failed as attached. A manual "make -C contrib
> installcheck" fails the same way on a ppc64 GNU/Linux box, but it passes on
> x86_64 and aarch64. Since contrib test suites don't recognize TEMP_CONFIG,
> check-world passes everywhere.

Oh, crap. I didn't realize that TEMP_CONFIG didn't affect the contrib
test suites. Is there any reason for that, or is it just kinda where
we ended up?
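
For reference, the procedure quoted above amounts to roughly the
following sketch. The two settings and the make invocation are the ones
from the quoted text; putting the file under mktemp is just an
illustrative choice, and the script only prints the make command rather
than running it, since it has to be run from a source tree. As Noah
notes, the contrib suites don't recognize TEMP_CONFIG, so this by itself
does not exercise them with these settings.

```shell
#!/bin/sh
# Sketch only: build the temporary config file described in the quoted text.
set -e

# Hypothetical location for the file; any path readable by the tests works.
conf=$(mktemp)

# The two settings quoted above, one per line.
cat > "$conf" <<'EOF'
force_parallel_mode=regress
max_parallel_degree=2
EOF

# Print the invocation instead of running it; run it by hand from the
# top of a configured source tree.
echo "make check-world TEMP_CONFIG=$conf"
```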

Retrying it the way you did it, I see the same errors here, so I think
this isn't a PPC-specific problem, but just a problem in general.
I've actually seen these kinds of errors before in earlier versions of
the testing code that eventually became force_parallel_mode. I got
fooled into believing I'd fixed the problem because of my confusion
about how TEMP_CONFIG worked. I think this is more likely to be a bug
in force_parallel_mode than a bug in the code that checks whether a
normal parallel query is safe, but I'll have to track it down before I
can say for sure.

Thanks for testing this. It's not delightful to discover that I
muffed this, but better to find it now than in 6 months.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
