Re: [Pgbuildfarm-members] latest buildfarm client release

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [Pgbuildfarm-members] latest buildfarm client release
Date: 2015-11-22 15:37:08
Message-ID: 5651E124.30508@dunslane.net
Lists: buildfarm-members pgsql-hackers

On 11/22/2015 12:47 AM, Tom Lane wrote:
> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>> I have just released version 4.16 of the PostgreSQL Buildfarm client
> I updated my critters to 4.16, and since nothing much was happening in
> git, decided to test by doing "run_build.pl --nosend --verbose --force"
> manually on prairiedog. That run went fine, but the cron job firing
> run_branches.pl every few minutes was still live, and one of its runs
> went a tad nuts even though nothing was happening in git:
>
> Buildfarm member prairiedog failed on REL9_3_STABLE stage pgsql-Git
> Buildfarm member prairiedog failed on REL9_4_STABLE stage pgsql-Git
> Buildfarm member prairiedog failed on REL9_5_STABLE stage pgsql-Git
>
> That resulted in these reports uploaded to the server:
>
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=prairiedog&dt=2015-11-21%2018%3A27%3A29
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=prairiedog&dt=2015-11-21%2018%3A27%3A19
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=prairiedog&dt=2015-11-21%2018%3A26%3A12
>
> which contain the following failure reports, respectively:
>
> Missing checked out branch bf_REL9_5_STABLE:
> fatal: Not a git repository (or any of the parent directories): .git
>
> Missing checked out branch bf_REL9_4_STABLE:
> fatal: Not a git repository (or any of the parent directories): .git
>
> fatal: unable to read tree 719b1b413b507d0fc86162f6aa45b6e44e6d82a1
> Cannot rebase: Your index contains uncommitted changes.
> Please commit or stash them.
>
> None of that makes any possible sense, because I certainly wasn't touching
> the git tree by hand, and the run_build job was only touching HEAD.
> There's nothing really broken on the machine, because the next set of
> runs went through fine.
>
> Don't know what to make of this, except that probably the buildfarm
> script's concurrent-job interlocks need some attention.
>
>

Oh, ouch. Well, that message comes from us just doing "git branch" to
sanity check what branch we're on, and that happens before anything that
was changed in this release. I had assumed, possibly naively, that git
would lock against itself. Maybe not with multiple workdirs. This only
matters if you're using git_use_workdirs, like you are, since otherwise
the git repos are totally independent, and run_build is definitely
locked against itself on a given branch. I'll look at adding a global
wait lock, just while git checkout is running, to cover this case. In
normal operation we don't expect this to occur, since run_branches.pl
just runs branches one at a time, so I don't think we need to put out an
emergency fix, but you've uncovered a corner case that all my testing
has missed.
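
To make the idea concrete, here is a minimal sketch of a global wait lock held only while git checkout runs. It is an illustration under stated assumptions, not the buildfarm client's code: the client is written in Perl, and the names global_wait_lock, checkout_branch and the lock file path are all hypothetical. The sketch assumes a Unix host where flock() is available.

```python
import contextlib
import fcntl
import os
import subprocess


@contextlib.contextmanager
def global_wait_lock(lock_path):
    """Hold an exclusive flock on lock_path for the duration of the block.

    lock_path is a hypothetical file shared by every branch's run, so that
    only one process at a time can be inside the protected section.
    """
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Blocking call: a second run waits here instead of failing.
        fcntl.flock(fd, fcntl.LOCK_EX)
        yield
    finally:
        # Closing the descriptor releases the lock.
        os.close(fd)


def checkout_branch(repo_dir, branch, lock_path):
    """Run 'git checkout' in repo_dir while holding the global lock."""
    with global_wait_lock(lock_path):
        subprocess.run(["git", "checkout", branch], cwd=repo_dir, check=True)
```

The lock is taken only around the checkout, so builds on different branches can still overlap everywhere else. A call might look like checkout_branch("/path/to/workdir", "bf_REL9_5_STABLE", "/path/to/buildroot/git.lock"), where both paths are placeholders.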

Thanks for the report.

cheers

andrew
