Re: Intermittent buildfarm failures on wrasse

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Noah Misch <noah(at)leadboat(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, David Rowley <dgrowleyml(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Subject: Re: Intermittent buildfarm failures on wrasse
Date: 2022-04-20 04:55:48
Message-ID: CAD21AoAsuDJE2u3_yFKr7J3jdim-iZg2oYdUzKZCpkhPJHsxFw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Apr 20, 2022 at 3:29 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> writes:
> > On 2022-Apr-15, Tom Lane wrote:
> >> Here's a WIP patch for that. The only exciting thing in it is that
> >> because of some undocumented cowboy programming in walsender.c, the
> >> Assert((proc->statusFlags & (~PROC_COPYABLE_FLAGS)) == 0);
> >> in ProcArrayInstallRestoredXmin fires unless we skip that.
>
> > Hmm, maybe a better use of that define is to use to select which flags
> > to copy, rather than to ensure we they are the only ones set. What
> > about this?
>
> Yeah, I thought about that too, but figured that the author probably
> had a reason for writing the assertion the way it was.

The motivation behind the assertion was that when we copy whole
statusFlags from the leader process to the worker process we want to
make sure that the flags we're copying is a known subset of the flags
that are valid to copy from the leader.

> If we want
> to redefine PROC_COPYABLE_FLAGS as "flags associated with xmin",
> that's fine by me. But I'd suggest that both the name of the macro
> and the comment for it in proc.h should be revised to match that
> definition.
>
> Another point is that as you've coded it, the code doesn't so much
> copy those flags as union them with whatever the recipient had,
> which seems wrong. I could go with either
>
> Assert(!(MyProc->statusFlags & PROC_XMIN_FLAGS));
> MyProc->statusFlags |= (proc->statusFlags & PROC_XMIN_FLAGS);
>
> or
>
> MyProc->statusFlags = (MyProc->statusFlags & ~PROC_XMIN_FLAGS) |
> (proc->statusFlags & PROC_XMIN_FLAGS);
>
> Perhaps the latter is more future-proof.

Copying only xmin-related flags in this way also makes sense to me and
there is no problem at least for now. A note would be that when we
introduce a new flag that needs to be copied in the future, we need to
make sure to add it to PROC_XMIN_FLAGS so it is copied. Otherwise a
similar issue we fixed by 0f0cfb494004befb0f6e could happen again.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2022-04-20 05:49:24 Re: Handle infinite recursion in logical replication setup
Previous Message Kyotaro Horiguchi 2022-04-20 04:54:59 Re: TRAP: FailedAssertion("tabstat->trans == trans", File: "pgstat_relation.c", Line: 508