Re: wrong query results on bf leafhopper

From: Robins Tharakan <tharakan(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, "Tharakan, Robins" <tharar(at)amazon(dot)com>
Subject: Re: wrong query results on bf leafhopper
Date: 2025-06-03 01:15:51
Message-ID: CAEP4nAwhtsZYFfzLGiq-tHJaEFw55TpnrKOxzMU1R+HsL3wjEg@mail.gmail.com
Lists: pgsql-hackers

Hi,

On Thu, 29 May 2025 at 02:32, Andres Freund <andres(at)anarazel(dot)de> wrote:

> On 2025-05-28 22:51:14 +0930, Robins Tharakan wrote:
> > Recently leafhopper failed again on the same test. For now I've paused it.
> > To rule out the compiler (and its maturity on the architecture), I'll
> > upgrade gcc (to nightly, or something more recent) and then re-enable to
> > see if it changes anything.
>
> +1 to a gcc upgrade, gcc 11 is rather old and out of upstream support.

Ack. I've updated leafhopper to gcc master. For now (to get the machine
green / running), I've disabled some flags, which I'll revisit later.
Hopefully those flags aren't related to compiler maturity, which is what
I'm trying to rule out here.

> A kernel upgrade would be good too. My completely baseless gut feeling is
> that some SIMD registers occasionally get corrupted, e.g. due to a kernel
> interrupt / context switch not properly storing & restoring them. Weirdly
> enough the instrumentation code is among the pieces of PG code most
> vulnerable to that because we mostly don't do enough auto-vectorizable
> math, but InstrEndLoop(), InstrStopNode() etc are trivially
> auto-vectorizable. I'm pretty sure I've previously analyzed problems
> around this, but don't remember the details (IA64 maybe?).
>

Fair point, I'll keep that option open. Originally, the machine was spun up
to evaluate the graviton4 ec2 instance and I'd like to explore whether the
stock-kernel / kernel-updates are able to keep the instance green (and
resort to updating the kernel only if I exhaust all other options - pg /
compiler etc.).
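
To make the quoted SIMD hypothesis a bit more concrete, here is a minimal
sketch of the kind of code being described: a handful of adjacent double
additions that a compiler may merge into vector instructions at higher
optimization levels. Everything in it is hypothetical (the DemoCounters
struct and the demo_end_loop / demo_count_mismatches helpers are made up
for illustration). It is not PostgreSQL's instrumentation code and not
something that was run on leafhopper.

```c
#include <assert.h>   /* only needed for the assert() in the usage note */

/* Hypothetical stand-in for per-node counters; NOT PostgreSQL's struct. */
typedef struct DemoCounters
{
    double startup;
    double total;
    double ntuples;
    double nloops;
} DemoCounters;

/*
 * Fold one cycle's counters into the running totals.  Four independent
 * additions on adjacent doubles are a pattern a compiler may turn into
 * SIMD adds, which would make the result depend on vector registers
 * being saved and restored correctly.
 */
static void
demo_end_loop(DemoCounters *acc, const DemoCounters *cur)
{
    acc->startup += cur->startup;
    acc->total += cur->total;
    acc->ntuples += cur->ntuples;
    acc->nloops += 1.0;
}

/*
 * Repeat the same accumulation 'rounds' times and count the rounds whose
 * result differs from the exact expected value.  The inputs are chosen so
 * every intermediate sum is exactly representable as a double, so any
 * mismatch would mean a value changed underneath us.
 */
static long
demo_count_mismatches(long rounds, long iters)
{
    const DemoCounters cur = {0.5, 2.0, 3.0, 0.0};
    long        bad = 0;

    for (long r = 0; r < rounds; r++)
    {
        DemoCounters acc = {0.0, 0.0, 0.0, 0.0};

        for (long i = 0; i < iters; i++)
            demo_end_loop(&acc, &cur);

        if (acc.startup != 0.5 * (double) iters ||
            acc.total != 2.0 * (double) iters ||
            acc.ntuples != 3.0 * (double) iters ||
            acc.nloops != (double) iters)
            bad++;
    }
    return bad;
}
```

On a healthy machine, calling
assert(demo_count_mismatches(1000, 100000) == 0) from a main() should pass.
Note that an optimizing compiler is free to simplify a loop this small, so
this only illustrates the shape of the code, it is not a reliable detector.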

-
robins
