Re: Race to build pg_isolation_regress in "make -j check-world"

From: Noah Misch <noah(at)leadboat(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Race to build pg_isolation_regress in "make -j check-world"
Date: 2018-01-02 01:39:56
Message-ID: 20180102013956.GA898314@rfd.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Dec 16, 2017 at 08:06:05PM -0800, Noah Misch wrote:
> On Mon, Nov 06, 2017 at 12:07:52AM -0800, Noah Misch wrote:

> I now see similar trouble from multiple
> "make" processes running "make -C contrib/test_decoding install" concurrently.
> This is a risk for any directory named in an EXTRA_INSTALL variable of more
> than one makefile. Under the right circumstances, this would affect
> contrib/hstore and others in addition to contrib/test_decoding. That brings
> me back to the locking idea:
>
> > The problem of multiple "make" processes in a directory (especially src/port)
> > shows up elsewhere. In a cleaned tree, "make -j -C src/bin" or "make -j
> > installcheck-world" will do it. For more-prominent use cases, src/Makefile
> > prevents this with ".NOTPARALLEL:" and building first the directories that are
> > frequent submake targets. Perhaps we could fix the general problem with
> > directory locking; targets that call "$(MAKE) -C FOO" would first sleep until
> > FOO's lock is available. That could be tricky to make robust.
>
> If one is willing to assume that a lock-holding process never crashes, locking
> in a shell script is simple: mkdir to lock, rmdir to unlock. I don't want to
> assume that. The bakery algorithm provides convenient opportunities for
> checking whether the last locker crashed; I have attached a shell script
> demonstrating this approach. Better ideas? Otherwise, I'll look into
> integrating this design into the makefiles.

Performance has been the principal snare. I value "make -j" being fast when
there's little to rebuild, but that shell script approach slowed an empty
build by 340% (GNU/Linux w/ SSD) to 2300% (Cygwin). In a build having nothing
to do, merely adding a no-op wrapper around "make -C" (does nothing but
execvp() the real GNU make) slowed the build by over 10%[1]. To get what I
considered decent performance took several design changes:

1. Replace the shell script implementation of the bakery algorithm with a C
program that issues fcntl(F_SETLK) on the target directory's makefile.

2. Stop re-entering widely-referenced directories like src/common and src/port
dozens of times per build. Instead, enter them before any "make -C", then
assume they're built if $(MAKELEVEL) is nonzero. This reduced the total
number of "make -C" calls and lock acquisitions from 276 to 147.

3. Lock only directories known to be entered more than once per top-level
target. Preliminarily, this reduced the 147 lock acquisitions to 25.

I regret that (3) entails ongoing maintenance of a whitelist of such
directories; adding a "make -C" call can expand the list. However, I can
automate verification of that whitelist. If I abandon (3), an empty build in
an NFS-mounted source directory slows from 6.4s to 22s; with all three design
changes intact, it slows to 6.6s. I think that justifies keeping (3).

I considered abandoning (1). An empty build would then slow from 6.6s to 7.7s
on NFS-mounted sources, and it would slow from 2.1s to 6.8s on Cygwin. (I
expect similar on MSYS.) That is perhaps acceptable. It would save a few
lines of code and perhaps avoid portability snags. Unlike abandoning (3), I
would not expect long-term maintenance savings. For systems with low PID
entropy[2], (1) also eliminates unreliable code for detecting that a lock
holder has died. I plan to keep all three design changes.

The non-performance problems that arose were easy to fix:

a. In addition to avoiding overlapping "make install" invocations in each
directory, we must ensure no directory is installed more than once per
tmp_install. Otherwise, the second "make install" could unlink or
overwrite a file while another process runs a test that uses the file. I
fixed this with stamp files like we use elsewhere in Makefile.global.

b. Our "make -C" call graph has cycles[3]. To avoid deadlocks, I accumulated
lock-holding PIDs in an environment variable. If the holder of a desired
lock appears in that variable, then that holder is sleeping until after the
present process completes. We can then proceed without a lock acquisition.

Comments? Otherwise, I'll look at finishing the patch series.

Thanks,
nm

[1] I was surprised to see that GNU make is this efficient at determining
there's nothing to rebuild.

[2] For example, Cygwin reuses PIDs as often as every 645 processes. Also,
Cygwin processes have both a Cygwin PID and a Windows PID, and kill() can
match either PID. The shell script relied on kill(PID, 0) to deduce that
a lock holder had died. We may have ended up needing a secondary factor,
like process start time, on such systems.

[3] For example, src/backend depends on libpgport_srv.a, and src/port depends
on submake-errcodes of src/backend.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nikita Glukhov 2018-01-02 02:04:29 Re: [HACKERS] SQL/JSON in PostgreSQL
Previous Message Andres Freund 2018-01-01 22:42:28 Re: pgsql: Add parallel-aware hash joins.