Re: gs_group_1 crashing on 13beta2/s390x

From: Andres Freund <andres(at)anarazel(dot)de>
To: Christoph Berg <myon(at)debian(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: gs_group_1 crashing on 13beta2/s390x
Date: 2020-10-15 08:32:46
Message-ID: 20201015083246.kie5726xerdt3ael@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2020-10-14 17:56:16 -0700, Andres Freund wrote:
> Oh dear. It's not as simple as that. The issue indeed are relocations,
> but we don't hit those errors. The issue rather is that the systemz
> specific relative redirection code thought that the only relative
> symbols are functions. So it creates a stub function to redirect
> them. Which turns out to not work well with variables like
> CurrentMemoryContext...

That might be a problem - but the main problem causing the crash at hand
is likely something else. The prototypes we create for
ExecAggTransReparent() were missing the 'zeroext' attribute for the
'isnull' parameter, because the code for copying the attributes from
llvmjit_types.bc didn't go deep enough (i.e. I didn't quite grok the
pretty weird API). On s390x that led to the newValueIsNull argument in
ExecAggTransReparent() having a 0 lower byte, but set higher bytes -
which then *sometimes* fooled the if (!newValueIsNull) check, which
assumed that the higher bits were unset.
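
To make the failure mode concrete, here is a rough C model of it. This
is only a sketch under assumptions, not the actual JIT or backend code:
reg_t, pass_bool_zeroext(), pass_bool_no_ext(), callee_sees_null() and
intended_null() are all hypothetical names, and the "register" is just a
64-bit integer standing in for how a bool argument might be passed. The
caller without zeroext writes only the low byte and leaves whatever was
in the upper bits; the callee, having been compiled on the assumption
that the argument is zero-extended, tests the whole register.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit argument register. */
typedef uint64_t reg_t;

/* Caller that honors zeroext: the bool fills the low byte and the
 * upper bits are cleared, whatever the register held before. */
reg_t pass_bool_zeroext(bool b, reg_t stale)
{
    (void) stale;
    return (reg_t) (uint8_t) b;
}

/* Caller whose prototype lacks zeroext: only the low byte is written,
 * the upper bits keep their stale contents. */
reg_t pass_bool_no_ext(bool b, reg_t stale)
{
    return (stale & ~(reg_t) 0xff) | (reg_t) (uint8_t) b;
}

/* Callee that assumes zero extension: it tests the full register, so
 * stale upper bits make a false value look true. */
bool callee_sees_null(reg_t r)
{
    return r != 0;
}

/* What the caller actually meant: only the low byte carries the bool. */
bool intended_null(reg_t r)
{
    return (r & 0xff) != 0;
}
```

With stale upper bits such as 0xdeadbeef00, a false isnull passed via
pass_bool_no_ext() is read as true by callee_sees_null(); with a stale
value of 0 the same call behaves correctly, which would match the
"sometimes" above.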

I have a fix for this, but I've just stared at s390 assembly code for
~10h, never having done so before. So that'll have to wait for tomorrow.

It's quite possible that that fix would also help on other
architectures...

Greetings,

Andres Freund
