From: | Noah Misch <noah(at)leadboat(dot)com> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Race to build pg_isolation_regress in "make -j check-world" |
Date: | 2017-11-06 08:07:52 |
Message-ID: | 20171106080752.GA1298146@rfd.leadboat.com |
Lists: | pgsql-hackers |
I've been enjoying the speed of parallel check-world, but I get spurious
failures from makefile race conditions. Commit c66b438 fixed the simple ones.
More tricky is this problem of multiple "make" processes entering
src/test/regress concurrently, which causes failures like these:

  gcc: error: pg_regress.o: No such file or directory
  make[4]: *** [pg_isolation_regress] Error 1

  /bin/sh: ../../../src/test/isolation/pg_isolation_regress: Permission denied
  make -C test_extensions check
  make[2]: *** [check] Error 126
  make[2]: Leaving directory `/home/nm/src/pg/backbranch/10/src/test/isolation'

  /bin/sh: ../../../../src/test/isolation/pg_isolation_regress: Text file busy
  make[3]: *** [isolationcheck] Error 126
  make[3]: Leaving directory `/home/nm/src/pg/backbranch/10/src/test/modules/snapshot_too_old'

This has been reproducible since commit 2038bf4 or earlier; "make -j check-world"
had worse problems before that era. A workaround is to issue "make -j; make
-j -C src/test/isolation" before the check-world. This problem doesn't affect
src/test/regress/pg_regress, because every top-level "make" or "make install",
including temp-install, builds pg_regress.
I tried fixing this by building src/test/isolation at the same time we run
temp-install. Naturally, that didn't help installcheck-world. It also caused
multiple "make" processes to enter src/port concurrently. I could fix both
check-world and installcheck-world with the attached hack of building
src/test/isolation during every top-level build or install.
The problem of multiple "make" processes in a directory (especially src/port)
shows up elsewhere. In a cleaned tree, "make -j -C src/bin" or "make -j
installcheck-world" will do it. For more-prominent use cases, src/Makefile
prevents this with ".NOTPARALLEL:" and building first the directories that are
frequent submake targets. Perhaps we could fix the general problem with
directory locking; targets that call "$(MAKE) -C FOO" would first sleep until
FOO's lock is available. That could be tricky to make robust.
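
To illustrate the directory-locking idea, here is a minimal sketch, assuming a POSIX shell and using an atomic mkdir as the lock primitive. The helper name with_dir_lock and the lock name .submake.lock are hypothetical; neither is part of the PostgreSQL build system or of the attached patch.

```shell
# Hypothetical helper: run a command while holding a per-directory lock.
# Usage: with_dir_lock DIR COMMAND [ARGS...]
with_dir_lock() {
    wdl_dir=$1
    shift
    wdl_lock="$wdl_dir/.submake.lock"   # hypothetical lock name

    # mkdir either creates the directory or fails, atomically, so only
    # one process at a time can take the lock. Everyone else sleeps.
    until mkdir "$wdl_lock" 2>/dev/null; do
        sleep 1
    done

    # Run the command, remembering its exit status for the caller.
    wdl_status=0
    "$@" || wdl_status=$?

    # Release the lock so the next waiter can proceed.
    rmdir "$wdl_lock"
    return "$wdl_status"
}
```

A recipe that today runs "$(MAKE) -C FOO" could instead run something like "with_dir_lock FOO $(MAKE) -C FOO". The sketch leaves a stale lock behind if the process is killed while holding it, and a waiting recipe still occupies a job slot while it sleeps, which hints at why this could be tricky to make robust.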
For now, I propose back-patching the attached, sad hack. Better ideas?
Thanks,
nm
Attachment | Content-Type | Size |
---|---|---|
isolation-build-v1.patch | text/plain | 1013 bytes |