Cirrus-ci is lowering free CI cycles - what to do with cfbot, etc?

From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>
Subject: Cirrus-ci is lowering free CI cycles - what to do with cfbot, etc?
Date: 2023-08-08 02:15:41
Message-ID: 20230808021541.7lbzdefvma7qmn3w@awork3.anarazel.de
Lists: pgsql-hackers

Hi,

As some of you might have seen when running CI, cirrus-ci is restricting how
many CI cycles everyone can use for free (announcement at [1]). This takes
effect September 1st.

This obviously has consequences both for individual users of CI as well as
cfbot.

The first thing I think we should do is to lower the cost of CI. One thing I
had not entirely realized previously is that macos CI is by far the most
expensive CI to provide. That's not just the case with cirrus-ci, but also
with other providers. See the series of patches described later in the email.

To me, the situation for cfbot is different than the one for individual
users.

IMO, for the individual user case it's important to be able to use CI for
"free", without a whole lot of complexity. That, imo, rules out approaches like
providing $cloud_provider compute accounts; that's too much setup work. With
the improvements detailed below, cirrus' free CI would last about 65 runs /
month.

For cfbot I hope we can find funding to pay for compute to use for CI. By far
the most expensive bit is macos, to a significant degree because macos
licensing terms do not allow more than 2 VMs on a physical host :(.

The reasons we chose cirrus-ci were:

a) Ability to use full VMs, rather than a pre-selected set of VMs, which
allows us to test a larger number of platforms

b) Ability to link to log files, without requiring an account. E.g. github
actions doesn't allow viewing logs unless logged in.

c) Amount of compute available.

The set of free CI providers has shrunk since we chose cirrus, as have the
"free" resources provided. I started a wiki page, quite incomplete as of now,
at [4].

Potential paths forward for individual CI:

- migrate wholesale to another CI provider

- split CI tasks across different CI providers, rely on github et al
displaying the CI status for different platforms

- give up

Potential paths forward for cfbot, in addition to the above:

- Pay for compute / ask the various cloud providers to grant us compute
credits. At least some of the cloud providers can be used via cirrus-ci.

- Host (some) CI runners ourselves. Particularly with macos and windows, that
could provide significant savings.

- Build our own system, using buildbot, jenkins or whatnot.

Opinions as to what to do?

The attached series of patches:

1) Makes startup of macos instances faster, using more efficient caching of
the required packages. Also submitted as [2].

2) Introduces a template initdb that's reused during the tests. Also submitted
as [3].

3) Remove use of -DRANDOMIZE_ALLOCATED_MEMORY from macos tasks. It's
expensive. And CI also uses asan on linux, so I don't think it's really
needed.

4) Switch tasks to use debugoptimized builds. Previously many tasks used -Og,
to get decent backtraces etc. But the amount of CPU burned that way is too
large. One issue with the switch is that use of ccache becomes much more
crucial, as uncached build times increase significantly.

5) Move use of -Dsegsize_blocks=6 from macos to linux

Macos is expensive, and -Dsegsize_blocks=6 slows things down. Alternatively
we could stop covering segsize_blocks with both meson and autoconf. It does
affect runtime on linux as well.

6) Disable write cache flushes on windows

It's a bit ugly to do this without using the UI... Shaves off about 30s
from the tests.

7) pg_regress only checked once a second whether postgres started up, but
startup is usually much faster. Use pg_ctl's logic. It might be worth
replacing the use of psql with direct use of libpq in pg_regress instead; it
looks like the overhead of repeatedly starting psql is noticeable.
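
To illustrate the idea behind patch 7, here is a minimal Python sketch of
polling for startup at a short interval rather than once a second. It is not
the pg_regress or pg_ctl code (both are C); the function name
wait_for_startup, the is_ready callback and the 100 ms interval are all
assumptions chosen for illustration.

```python
import time
from typing import Callable


def wait_for_startup(
    is_ready: Callable[[], bool],
    timeout_s: float = 60.0,
    interval_s: float = 0.1,  # assumed interval, for illustration only
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll is_ready() until it returns True.

    Returns the number of checks that were needed. Raises TimeoutError if
    the server is still not ready after roughly timeout_s seconds.
    """
    max_checks = max(1, round(timeout_s / interval_s))
    for attempt in range(1, max_checks + 1):
        if is_ready():
            return attempt
        # A short sleep keeps the wasted time per test run small when the
        # server comes up quickly, which is the common case.
        sleep(interval_s)
    raise TimeoutError(f"server not ready after {max_checks} checks")
```

With a one second interval, a server that is ready after 150 ms still costs a
full second per startup; with a 100 ms interval the same startup is detected
on the second or third check. The is_ready callback stands in for whatever
readiness probe is used, e.g. a connection attempt.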

FWIW: with the patches applied, the "credit costs" in cirrus CI are roughly
like the following (depends on caching etc):

task                       cost in credits
linux-sanity               0.01
linux-compiler-warnings    0.05
linux-meson                0.07
freebsd                    0.08
linux-autoconf             0.09
windows                    0.18
macos                      0.28

total task runtime is 40.8
cost in credits is 0.76, monthly credits of 50 allow approx 66.10 runs/month
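
The arithmetic behind the runs/month estimate can be reproduced with a short
Python sketch. It uses the rounded per-task costs from the table above, so it
yields roughly 65.8 runs/month rather than exactly 66.10; the small difference
is presumably because the original figure was computed from unrounded costs
(that is an assumption). The function name runs_per_month is hypothetical.

```python
# Per-task credit costs, copied from the table above (rounded values).
TASK_COSTS = {
    "linux-sanity": 0.01,
    "linux-compiler-warnings": 0.05,
    "linux-meson": 0.07,
    "freebsd": 0.08,
    "linux-autoconf": 0.09,
    "windows": 0.18,
    "macos": 0.28,
}

# Monthly free credits, as stated in the email.
MONTHLY_CREDITS = 50


def runs_per_month(costs: dict[str, float], monthly_credits: float) -> tuple[float, float]:
    """Return (cost of one full CI run, number of full runs per month)."""
    cost_per_run = sum(costs.values())
    return cost_per_run, monthly_credits / cost_per_run


cost_per_run, runs = runs_per_month(TASK_COSTS, MONTHLY_CREDITS)
macos_share = TASK_COSTS["macos"] / cost_per_run

print(f"cost per run: {cost_per_run:.2f} credits")
print(f"runs per month: {runs:.1f}")
print(f"macos share of cost: {macos_share:.0%}")
```

The macos share makes the point about relative cost concrete: a single task
accounts for more than a third of the credits of a full run.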

Greetings,

Andres Freund

[1] https://cirrus-ci.org/blog/2023/07/17/limiting-free-usage-of-cirrus-ci/
[2] https://www.postgresql.org/message-id/20230805202539.r3umyamsnctysdc7%40awork3.anarazel.de
[3] https://postgr.es/m/20220120021859.3zpsfqn4z7ob7afz@alap3.anarazel.de

Attachment Content-Type Size
v1-0001-ci-macos-used-cached-macports-install.patch text/x-diff 7.5 KB
v1-0002-Use-template-initdb-in-tests.patch text/x-diff 10.4 KB
v1-0003-ci-macos-Remove-use-of-DRANDOMIZE_ALLOCATED_MEMOR.patch text/x-diff 994 bytes
v1-0004-ci-switch-tasks-to-debugoptimized-build.patch text/x-diff 3.0 KB
v1-0005-ci-Move-use-of-Dsegsize_blocks-6-from-macos-to-li.patch text/x-diff 1.1 KB
v1-0006-ci-windows-Disabling-write-cache-flushing-during-.patch text/x-diff 1.5 KB
v1-0007-regress-Check-for-postgres-startup-completion-mor.patch text/x-diff 1.3 KB
v1-0008-ci-Don-t-specify-amount-of-memory.patch text/x-diff 1.3 KB
