Re: Intermittent buildfarm failures on wrasse

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, Noah Misch <noah(at)leadboat(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, David Rowley <dgrowleyml(at)gmail(dot)com>
Subject: Re: Intermittent buildfarm failures on wrasse
Date: 2022-04-15 16:58:50
Message-ID: 20220415165850.rh7byj7ep5jmzuzb@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

(Sent again; my editor has somehow started to occasionally mangle mail
headers and ate the From:. Sorry for the duplicate.)

On 2022-04-15 12:36:52 -0400, Tom Lane wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
> > On April 15, 2022 11:23:40 AM EDT, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> The something is the logical replication launcher. In the failing runs,
> >> it is advertising xmin = 724 (the post-initdb NextXID) and continues to
> >> do so well past the point where tenk1 gets vacuumed.
>
> > That explains it. Before shmstat autovac needed to wait for the stats collector to write out stats. Now it's near instantaneous. So the issue probably existed before, just unlikely to ever be reached.
>
> Um, this is the logical replication launcher, not the autovac
> launcher.

Short-term confusion on my part...

> Your observation that a sleep in get_database_list() reproduces it
> confirms that

I don't understand what you mean here. get_database_list() is autovac
launcher code, so being able to reproduce the problem by putting a
sleep there doesn't seem to confirm anything about the logical rep
launcher?

> , and I don't entirely see why the timing of the LR launcher
> would have changed.

It could still be related to the autovac launcher no longer requesting,
pgstats no longer writing, and the launcher no longer reading the stats
file(s). That is obviously going to have some impact on scheduling.

> > We can't just ignore database less xmins for non-shared rels, because walsender propagates hot_standby_feedback that way. But we can probably add a flag somewhere indicating whether a database less PGPROC has to be accounted in the horizon for non-shared rels.
>
> Yeah, I was also thinking about a flag in PGPROC being a more reliable
> way to do this. Is there anything besides walsenders that should set
> that flag?

Not that I can think of. We only need it because of hot_standby_feedback.
I guess it's possible that somebody builds an extension that needs
something similar, but then they'd need to set that flag themselves...
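
To make the flag idea concrete, here is a minimal toy sketch in C. It is
not PostgreSQL code: ToyProc, affects_all_horizons, toy_data_horizon() and
the TOY_* constants are all hypothetical names made up for illustration,
and the sketch ignores xid wraparound and locking entirely. It only shows
the rule being discussed: when computing the horizon for non-shared rels, a
proc in another database (or in no database) is counted only if it has set
the flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins, not the real PostgreSQL definitions. */
#define TOY_INVALID_DB  0	/* proc is not connected to any database */
#define TOY_INVALID_XID 0	/* proc advertises no xmin */

typedef struct ToyProc
{
	uint32_t	database_id;	/* TOY_INVALID_DB if database-less */
	uint32_t	xmin;			/* TOY_INVALID_XID if none advertised */
	bool		affects_all_horizons;	/* assumed flag, e.g. set by a
										 * walsender relaying feedback */
} ToyProc;

/*
 * Oldest xmin that matters for non-shared rels in database my_db.
 * Starts from next_xid and only moves backwards. Plain integer
 * comparison is used, so wraparound is deliberately not handled.
 */
uint32_t
toy_data_horizon(const ToyProc *procs, size_t n,
				 uint32_t my_db, uint32_t next_xid)
{
	uint32_t	horizon = next_xid;

	assert(procs != NULL || n == 0);

	for (size_t i = 0; i < n; i++)
	{
		const ToyProc *p = &procs[i];

		if (p->xmin == TOY_INVALID_XID)
			continue;			/* nothing advertised */

		/* Procs outside my_db count only if they set the flag. */
		if (p->database_id != my_db && !p->affects_all_horizons)
			continue;

		if (p->xmin < horizon)
			horizon = p->xmin;
	}

	return horizon;
}
```

With this rule, a database-less proc advertising xmin = 724 without the
flag would no longer hold back the horizon for tenk1, while a database-less
proc that does set the flag would still be honored.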

Greetings,

Andres Freund
