Re: More efficient build farm animal wakeup?

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>
Subject: Re: More efficient build farm animal wakeup?
Date: 2022-11-20 21:31:14
Message-ID: CABUevEysQnc4UqPa--jOrUzD9YUabqvSdPHL371EBFmymqz_dw@mail.gmail.com
Lists: pgsql-hackers

On Sun, Nov 20, 2022 at 4:56 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:

> On Sun, Nov 20, 2022 at 1:35 AM Magnus Hagander <magnus(at)hagander(dot)net>
> wrote:
> > tl;dr: it's not there now, but yes, if we can find a smart way for the bf
> > clients to consume it, it is something we could build and deploy fairly
> > easily.
>
> Cool -- it sounds a lot like you've thought about this already :-)
>
> About the client: currently run_branches.pl makes an HTTP request for
> the "branches of interest" list. Seems like a candidate point for a
> long poll? I don't think it'd have to be much smarter than it is
> today, it'd just have to POST the commits it already has, I think.
>

Um, branches of interest will only pick up when it gets a new *branch*, not
a new *commit*, so I think that would be a very different problem to solve.
And I don't think we have new branches *that* often...

> Perhaps as a first step, the server could immediately report which
> branches to bother fetching, considering the client's existing
> commits. That'd almost always be none, but ~11.7 times per day a new
> commit shows up, and once a year there's a new interesting branch.
> That would avoid the need for the 6 git fetches that usually follow in
> the common case, which admittedly might not be a change worth making
> on its own. After all, the git fetches are probably quite similar
> HTTP requests themselves, except that there are 6 of them, one per branch,
> and they hit the public git server instead of some hypothetical
> buildfarm endpoint.
>

As Andres mentioned downthread, that's not a lot more lightweight than what
"git fetch" does.

The thing we'd want to avoid is having to do that so much and so often. And
getting to that is going to require modification of the buildfarm client to
make it more "smart" regardless. In particular, making it do this "right"
in the face of multiple branches is probably going to be a big win.

> Then you could switch to long polling by letting the client say "if
> currently none, I'm prepared to wait up to X seconds for a different
> answer", assuming you know how to build the server side of that
> (insert magic here). Of course, you can't make it too long or your
> session might be dropped in the badlands between client and server,
> but that's just a reason to make X configurable. I think RFC6202 says
> that 120 seconds probably works fine across most kinds of links, which
> means that you lower the total poll rate hitting the server, but--more
> interestingly for me as a client--you minimise latency when something
> finally happens. (With various keepalive tricks and/or heartbeat
> streaming tricks you could possibly make it much higher, who knows...
> but you'd have to set it very very low to do worse than what we're
> doing today in total request count). Or maybe there is some existing
> easy perl library that could be used for this (joke answer: cpan
> install Twitter::API and follow @pg_commits).
>

I also honestly wonder how big a problem a much longer than 120 seconds
timeout would be in practice. Since we own both the client and the server
in this case, we'd only be at the mercy of network equipment in between and I
think we're much less exposed to weirdness there than "the average
browser". Thus, as long as it's configurable, I think we could go for
something much longer by default.

I'd imagine something like:

    GET https://git.postgresql.org/buildfarm-branchtips
    X-branch-master: a4adc31f69
    X-branch-REL_14_STABLE: b33283cbd3
    X-longpoll: 120

For that one it would check branch master and REL_14_STABLE, and if either
branch tip doesn't match what was in the header, it'd return immediately
with a text file that's basically

    master:<whateveritis>

if master has changed and REL_14_STABLE has not.

If nothing has changed, go into longpoll for 120 seconds based on the
header, and if nothing at all has changed in that time, return a 304.
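
To make the comparison step concrete, here is a rough sketch in Python
(for illustration only; the buildfarm client itself is Perl). The header
names come from the example request; the function names and the
server_tips dictionary are made up for this sketch and don't exist
anywhere today. It also assumes the HTTP stack hands over header names
with their case intact, which not every stack does.

```python
from typing import Dict, Mapping

# HTTP header names are case-insensitive, so match the prefix in lowercase,
# but keep the branch part as sent (branch names are case-sensitive).
BRANCH_PREFIX = "x-branch-"


def client_tips_from_headers(headers: Mapping[str, str]) -> Dict[str, str]:
    """Extract {branch: commit} from the X-branch-<name> request headers."""
    tips = {}
    for name, value in headers.items():
        if name.lower().startswith(BRANCH_PREFIX):
            tips[name[len(BRANCH_PREFIX):]] = value.strip()
    return tips


def changed_branches(client_tips: Mapping[str, str],
                     server_tips: Mapping[str, str]) -> Dict[str, str]:
    """Return the branches the client asked about whose tip has moved."""
    return {
        branch: server_tips[branch]
        for branch, commit in client_tips.items()
        if branch in server_tips and server_tips[branch] != commit
    }


def render_body(changed: Mapping[str, str]) -> str:
    """Render the plain-text response, one 'branch:commit' line per change."""
    return "".join(f"{b}:{c}\n" for b, c in sorted(changed.items()))
```

An empty result from changed_branches() is the case where the server
would go into the long poll instead of answering right away.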

We could also use something like a websocket to just stream the changes out
over.

In either case it would also need to change the buildfarm client to run as
a daemon rather than a cronjob I think? (obviously optional, we don't have
to remove the current abilities)
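
As a rough idea of what the daemon mode could look like, here is a sketch
in Python (illustration only, not the actual Perl client). The poll and
build callables are hypothetical stand-ins: poll(tips) would issue the
long-poll request and return the changed {branch: commit} pairs, or an
empty dict on a 304, and build(branch, commit) would kick off a run.
Error handling and backoff are left out.

```python
from typing import Callable, Dict, Optional


def run_daemon(poll: Callable[[Dict[str, str]], Dict[str, str]],
               build: Callable[[str, str], None],
               tips: Dict[str, str],
               max_polls: Optional[int] = None) -> Dict[str, str]:
    """Loop forever (or max_polls times): long-poll, then build what changed.

    poll() is expected to block for up to the long-poll timeout, so an
    idle loop costs one request per timeout period rather than one per
    cron tick.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        polls += 1
        changed = poll(tips)  # {} means "nothing new" (the 304 case)
        for branch, commit in changed.items():
            build(branch, commit)
            tips[branch] = commit  # remember the new tip for the next poll
    return tips
```

max_polls only exists so the loop can be exercised in a test; a real
daemon would leave it unset.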

> However, when I started this thread I was half expecting such a thing
> to exist already, somewhere, I just haven't been able to find it
> myself... Don't other people have this problem? Maybe everybody who
> has this problem uses webhooks (git server post commit hook opens
> connection to client) as you mentioned, but as you also mentioned
> that'd never fly for our topology.
>

Yeah, webhook seems to be what most people use.

FWIW, an implementation for us would be a small daemon that receives such
webhooks from our git server and redistributes them for the long polling.
That's still the easiest way to get the data out of git itself...
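
The core of such a daemon could be sketched like this in Python
(illustration only; the class and method names are invented for this
sketch). It assumes the webhook receiver calls update() for each pushed
branch and each long-poll request calls wait_for_change() in its own
thread; the HTTP plumbing on both sides is omitted.

```python
import threading
import time
from typing import Dict, Mapping, Optional


class BranchTipHub:
    """Holds the current branch tips and wakes waiting long pollers."""

    def __init__(self, tips: Optional[Mapping[str, str]] = None) -> None:
        self._tips: Dict[str, str] = dict(tips or {})
        self._cond = threading.Condition()

    def update(self, branch: str, commit: str) -> None:
        """Called by the webhook receiver when git reports a new tip."""
        with self._cond:
            if self._tips.get(branch) != commit:
                self._tips[branch] = commit
                self._cond.notify_all()  # wake every waiting long poll

    def wait_for_change(self, client_tips: Mapping[str, str],
                        timeout: float) -> Dict[str, str]:
        """Return changed {branch: commit}, or {} after timeout seconds."""
        deadline = time.monotonic() + timeout
        with self._cond:
            while True:
                changed = {
                    b: self._tips[b]
                    for b, c in client_tips.items()
                    if b in self._tips and self._tips[b] != c
                }
                if changed:
                    return changed
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    return {}  # caller would answer with a 304
                self._cond.wait(remaining)
```

The tips are rechecked in a loop after every wakeup, so a push to a
branch the client didn't ask about just sends it back to waiting for
the rest of its timeout.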

//Magnus
