Re: Heads Up: cirrus-ci is shutting down June 1st

From:	Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>
To:	Andres Freund <andres(at)anarazel(dot)de>
Cc:	Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Jacob Champion <jacob(dot)champion(at)enterprisedb(dot)com>, Jelte Fennema-Nio <postgres(at)jeltef(dot)nl>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Zsolt Parragi <zsolt(dot)parragi(at)percona(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>
Subject:	Re: Heads Up: cirrus-ci is shutting down June 1st
Date:	2026-06-03 12:46:56
Message-ID:	CAKZiRmy+8t7W7R9r_J1PMjUcsYqSkuSSMvLESu1GEQgT29zZCw@mail.gmail.com
Lists:	pgsql-hackers

Hi,

On Tue, Jun 2, 2026 at 8:38 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> Hi,
>
> On 2026-06-01 12:01:58 +0200, Jakub Wartak wrote:
> > So I've spent half of day on trying to see what makes the tests so slow at
> > least in my case. I can also confirm %CPU combined (with high 33% sys).
>
> Was this locally on your machine? I assume that's without enabling
> sanitizers?

Yup.

> In CI the bottleneck clearly is CPU at the moment, due to the relatively low
> number of cores.
>
> To reduce IO, one pretty significant thing we can do is to reduce the segment
> size used during tests. Creating lots of 16MB segments when most of them are
> only very partially used isn't free.

Right, saw that, nice.

> > 0. baseline was ~71s (stuff already hot)
> > 1a. down to 64s with dirtywriteback tune (and mostly to avoid NVMe/SSD wear)
> > 1b. ~65s with tmpfs, so I've left using dirtywriteback sysctls:
> > sudo mount -t tmpfs -o size=4G,uid=XXX,mode=755 tmpfs build/tmp_install
> > sudo mount -t tmpfs -o size=16G,uid=XXX,mode=755 tmpfs /build/testrun
>
> I don't think we should do that, real FS behaviour is something we do IMO want
> to test.

Ack.

> > 1,100 pg_upgrade
> > 896 isolation
> > 694 pg_dump
> > 682 pg_basebackup
> >
> > Fixing above subscription to ~5000 conns did not gain much (well it saved
> > 5% of runtime 43s -> 41s). It's literally 10k lines of
> > s/$node_subscriber->safe_psql/sub_bg->query_safe/g across dozens of files
> > in src/test/subscription/t/). Too big for review and I'm not sharing as
> > it could contain errors.
>
> Did you test the effect of those changes on windows (via CI)? I'd expect that
> big a reduction to have a substantially bigger effect there.

No, I did not, and I've wiped the changes already. It was just a probe for
any simple quick wins...

> > 5. Spotted that we do plenty of initdb and cached-initdb (cp), so I had idea
> > about XFS's cp reflinks=always in build/, but I couldn't do that without
> > /dev/loop, so apparently XFS (reflink=1) vs ext4(reflink=0) halves number
> > of writes while even still on /dev/loop device, but that somehow
> > does not directly contribute to duration of the test (well we are
> > bottlenecked on CPU anyway, so this is just smarter? way of avoiding I/O;
> > maybe with cold-caches and on real VMs running with XFS would be faster)
> >
> > +++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
> > @@ -687,7 +687,13 @@ sub init
> > }
> > else
> > {
> > - @copycmd = qw(cp -RPp);
> > + @copycmd = qw(cp --reflink=always -RPp);
>
> Afaict cp uses reflinks automatically by default, if the filesystem supports
> it. On CI it's not supported due to ext4, but locally it seems to work for
> me.

Yeah, it does. I just wanted to be double-sure, but then I realized that on CI
we are on overlayfs on top of the host's ext4 :( It's a pity, because that cp
could be instant (as could CREATE DATABASE with file_copy_method=clone). Even
with --wal-segsize=1 an empty cluster takes ~32MB (3x8MB), and even rough
estimates of just the cached initdb calls give huge numbers:

$ grep -r -A 5 'PostgreSQL::Test::Cluster->new' src contrib |
    grep -Po '\->init[a-z_]*' | sort | uniq -c
341 ->init
98 ->init_from_backup
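
As a rough sketch, the two counts above can be tallied against the ~32MB per-cluster figure like this. The variable names are made up for illustration, the counts are hardcoded from the grep output rather than recomputed, and 32MB is only the estimate mentioned above, not a measurement:

```shell
#!/bin/sh
# Counts copied from the grep output above (not recomputed here).
init_calls=341
init_from_backup_calls=98
# Rough per-cluster size in MB; an estimate, not a measured value.
per_cluster_mb=32

total_calls=$((init_calls + init_from_backup_calls))
total_mb=$((total_calls * per_cluster_mb))

echo "${total_calls} clusters * ${per_cluster_mb}MB = ${total_mb}MB"
```

This only counts call sites found by grep, so tests that call init in a loop, or more than 5 lines after the constructor, are not reflected.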

So that's like 400 * 32MB = 12800 MB? But I get the point of using a real fs,
it's just that we should have some option of using throwaway filesystems
(maybe we even do, but on own/dedicated runners).
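
For anyone wanting to check whether a given build directory can do reflink copies at all (on CI it cannot, per the overlayfs-on-ext4 point above), a minimal probe might look like the following. It assumes GNU coreutils cp; supports_reflink is a hypothetical helper name, not something in the tree:

```shell
#!/bin/sh
# Probe whether the filesystem holding DIR supports reflink (clone) copies.
# Returns 0 if "cp --reflink=always" succeeds, 1 if it fails,
# and 2 if the probe file could not be created.
# Assumes GNU cp; supports_reflink is a hypothetical helper name.
supports_reflink() {
    probe_dir=${1:-.}
    probe_src=$(mktemp "$probe_dir/reflink_probe.XXXXXX") || return 2
    probe_dst="$probe_src.clone"
    echo probe > "$probe_src"
    # --reflink=always makes cp fail instead of falling back to a full copy.
    if cp --reflink=always "$probe_src" "$probe_dst" 2>/dev/null; then
        probe_rc=0
    else
        probe_rc=1
    fi
    rm -f "$probe_src" "$probe_dst"
    return "$probe_rc"
}

# Example (not run here):
#   if supports_reflink build; then echo "reflink ok"; else echo "no reflink"; fi
```

The probe uses --reflink=always on purpose: with the default behaviour a missing clone feature is silently hidden by a fallback to a regular copy, so success would not tell you anything.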

-J.

In response to

Re: Heads Up: cirrus-ci is shutting down June 1st at 2026-06-02 18:38:37 from Andres Freund
