Re: CI and test improvements

From: Andres Freund <andres(at)anarazel(dot)de>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Noah Misch <noah(at)leadboat(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Anastasia Lubennikova <lubennikovaav(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se>
Subject: Re: CI and test improvements
Date: 2022-08-28 21:28:02
Message-ID: 20220828212802.r6eymfffrgr3lxxt@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2022-08-28 12:10:29 -0500, Justin Pryzby wrote:
> On Sun, Aug 28, 2022 at 09:07:52AM -0700, Andres Freund wrote:
> > > --- /dev/null
> > > +++ b/src/tools/ci/windows-compiler-warnings
> >
> > Wouldn't that be doable as something like
> > sh -c 'if test -s file; then cat file;exit 1; fi"
> > inside .cirrus.yml?
>
> I had written it inline in a couple ways, like
> - sh -exc 'f=msbuild.warn.log; if [ -s "$f" ]; then cat "$f"; exit 1; else exit 0; fi'
>
> but then separated it out as you suggested in
> 20220227010908(dot)vz2a7dmfzgwg742w(at)alap3(dot)anarazel(dot)de
>
> after I complained about cmd.exe requiring escaping for && and ||
> That makes writing any shell script a bit perilous and a separate script
> seems better.

I remember that I suggested it - but note that the way I wrote above doesn't
have anything needing escaping. Anyway, what do you think of the multiline
split I suggested?

> > > Subject: [PATCH 03/25] cirrus/ccache: disable compression and show stats
> > >
> > > ccache since 4.0 enables zstd compression by default.
> > >
> > > With default compression enabled (https://cirrus-ci.com/task/6692342840164352):
> > > linux has 4.2; 99MB cirrus cache; cache_size_kibibyte 109616
> > > macos has 4.5.1: 47MB cirrus cache; cache_size_kibibyte 52500
> > > freebsd has 3.7.12: 42MB cirrus cache; cache_size_kibibyte 134064
> > > windows has 4.6.1; 180MB cirrus cache; cache_size_kibibyte 51179
> > > todo: compiler warnings
> > >
> > > With compression disabled (https://cirrus-ci.com/task/4614182514458624):
> > > linux: 91MB cirrus cache; cache_size_kibibyte 316136
> > > macos: 41MB cirrus cache; cache_size_kibibyte 118068
> > > windows: 42MB cirrus cache; cache_size_kibibyte 134064
> > > freebsd is the same
> > >
> > > The stats should either be shown or logged (or maybe run with CCACHE_NOSTATS,
> > > to avoid re-uploading cache tarball in a 100% cached build, due only to
> > > updating ./**/stats).
> > >
> > > Note that ccache 4.4 supports CCACHE_STATSLOG, which seems ideal.
> >
> > I stared at this commit message for a while, trying to make sense of it, and
> > couldn't really. I assume you're saying that the cirrus compression is better
> > with ccache compression disabled, but it's extremely hard to parse out of it.
>
> Yes, because ccache uses zstd-1, and cirrus uses gzip, which it's going
> to use no matter what ccache does, and gzip's default -6 is better than
> ccache's zstd-1.
>
> > This does too much at once. Show stats, change cache sizes, disable
> > compression.
>
> The cache size change is related to the compression level change; ccache
> prunes based on the local size, which was compressed with zstd-1 and,
> with this patch, not compressed (so ~2x larger). Also, it's more
> interesting to control the size uploaded to cirrus (after compression
> ith gzip-6).

That's what should have been in the commit message.

> > > From 6a6a97fc869fd1fd8b7ab5da5147f145581634f9 Mon Sep 17 00:00:00 2001
> > > From: Justin Pryzby <pryzbyj(at)telsasoft(dot)com>
> > > Date: Fri, 24 Jun 2022 00:09:12 -0500
> > > Subject: [PATCH 08/25] cirrus/freebsd: run with more CPUs+RAM and do not
> > > repartitiion
> > >
> > > There was some historic problem where tests under freebsd took 8+ minutes (and
> > > before 4a288a37f took 15 minutes).
> > >
> > > This reduces test time from 10min to 3min.
> > > 4 CPUs 4 tests https://cirrus-ci.com/task/4880240739614720
> > > 4 CPUs 6 tests https://cirrus-ci.com/task/4664440120410112 https://cirrus-ci.com/task/4586784884523008
> > > 4 CPUs 8 tests https://cirrus-ci.com/task/5001995491737600
> > >
> > > 6 CPUs https://cirrus-ci.com/task/6678321684545536
> > > 8 CPUs https://cirrus-ci.com/task/6264854121021440
> > >
> > > See also:
> > > https://www.postgresql.org/message-id/flat/20220310033347(dot)hgxk4pyarzq4hxwp(at)alap3(dot)anarazel(dot)de#f36c0b17e33e31e7925e7e5812998686
> > > 8 jobs 7min https://cirrus-ci.com/task/6186376667332608
> > >
> > > xi-os-only: freebsd
> >
> > Typo.
>
> No - it's deliberate so I can switch to and from "everything" to "this
> only".

I don't see the point in posting patches to be applied if they contain lots of
such things that a potential committer would need to catch and include a lot
of of fixup patches.

> > > @@ -71,8 +69,6 @@ task:
> > > fingerprint_key: ccache/freebsd
> > > reupload_on_changes: true
> > >
> > > - # Workaround around performance issues due to 32KB block size
> > > - repartition_script: src/tools/ci/gcp_freebsd_repartition.sh
> > > create_user_script: |
> > > pw useradd postgres
> > > chown -R postgres:postgres .
> > > --
> >
> > What's the story there - at some point that was important for performance
> > because of the native block size triggering significant read-modify-write
> > cycles with postres' writes. You didn't comment on it in the commit message.
>
> Well, I don't know the history, but it seems to be unneeded now.

It's possible it was mainly needed for testing with aio + dio. But also
possible that an upgrade improved the situation since.

> > > From fd1c36a0bd8fa608ccdff5be3735dac5e3e48bf3 Mon Sep 17 00:00:00 2001
> > > From: Justin Pryzby <pryzbyj(at)telsasoft(dot)com>
> > > Date: Wed, 27 Jul 2022 16:54:47 -0500
> > > Subject: [PATCH 09/25] cirrus/freebsd: run build+check in a make vpath
> >
> > > From 7052a32a21752b59632225684fc9426bb94e46e0 Mon Sep 17 00:00:00 2001
> > > From: Justin Pryzby <pryzbyj(at)telsasoft(dot)com>
> > > Date: Sun, 13 Feb 2022 17:56:40 -0600
> > > Subject: [PATCH 10/25] cirrus/windows: increase timeout to 25min
> >
> > No explanation?
>
> Because of the immediately following commit which makes it run all the
> tests.

Mention that in the commit message then. Especially when dealing with 25
commits I don't think you can expect others to infer such things.

> > > From 602983b2cf37fc43465c62330b2e15e9d6d2035d Mon Sep 17 00:00:00 2001
> > > From: Justin Pryzby <pryzbyj(at)telsasoft(dot)com>
> > > Date: Fri, 26 Aug 2022 12:00:10 -0500
> > > Subject: [PATCH 15/25] f!and chdir
> >
> > I don't see the point of pointing fixup commits to the list.
>
> It's a separate commit to make it easy to see the changes, separately,
> since I imagine maybe the "chdir" part won't be desirable, or maybe the
> PATH part won't. But I'm not sure, so I'm here soliciting feedback.

Shrug, I doubt you'll get much if asked that way.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2022-08-28 21:29:52 Re: add checkpoint stats of snapshot and mapping files of pg_logical dir
Previous Message Nathan Bossart 2022-08-28 21:14:21 Re: replacing role-level NOINHERIT with a grant-level option