Re: [PATCH] Hex-coding optimizations using SVE on ARM.

From: John Naylor <johncnaylorls(at)gmail(dot)com>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: "Chiranmoy(dot)Bhattacharya(at)fujitsu(dot)com" <Chiranmoy(dot)Bhattacharya(at)fujitsu(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, David Rowley <dgrowleyml(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, "Devanga(dot)Susmitha(at)fujitsu(dot)com" <Devanga(dot)Susmitha(at)fujitsu(dot)com>, "Ragesh(dot)Hajela(at)fujitsu(dot)com" <Ragesh(dot)Hajela(at)fujitsu(dot)com>
Subject: Re: [PATCH] Hex-coding optimizations using SVE on ARM.
Date: 2025-09-29 08:45:27
Message-ID: CANWCAZZMaxfO3CLr1_ViUZMknp_M9iJ0LJbw63E4xMRanGiSqg@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Sep 26, 2025 at 1:50 AM Nathan Bossart <nathandbossart(at)gmail(dot)com> wrote:
>
> On Thu, Sep 25, 2025 at 09:16:35PM +0700, John Naylor wrote:
> > (Separately, now I'm wondering if we can do the same for
> > vector8_has_le since _mm_min_epu8 and vminvq_u8 both exist, and that
> > would allow getting rid of )
>
> I think so. I doubt there's any performance advantage, but it could be
> nice for code cleanup. (I'm assuming you meant to say vector8_ssub
> (renamed to vector8_ussub() in the patch) after "getting rid of.")

Yes right, sorry. And it seems good to do such cleanup first, since it
doesn't make sense to rename something that is about to be deleted.

> I think I disagree on this one. We're not broadcasting arbitrary bytes for
> every vector element, we're broadcasting a patten of bytes that happens to
> be wider than the element size. I would expect this to be a relatively
> common use-case.

That's probably true. I'm still worried that the hack for working
around compiler pickiness (while nice enough in it's current form)
might break at some point and require awareness of compiler versions.

Hmm, for this case, we can sidestep the maintainability questions
entirely by instead using the new interleave functions to build the
masks:

vector8_interleave_low(vector8_zero(), vector8_broadcast(0x0f))
vector8_interleave_low(vector8_broadcast(0x0f), vector8_zero())

This generates identical code as v12 on Arm and is not bad on x86.
What do you think of the attached?

While looking around again, it looks like the "msk" variable isn't a
mask like the implies to me. Not sure of a better name because I'm not
sure what it represents aside from a temp variable.

+#elif defined(USE_NEON)
+ switch (i)
+ {
+ case 4:
+ return (Vector8) vshrq_n_u32((Vector32) v1, 4);
+ case 8:
+ return (Vector8) vshrq_n_u32((Vector32) v1, 8);
+ default:
+ pg_unreachable();
+ return vector8_broadcast(0);
+ }

This is just a compiler hint, so if the input is not handled I think
it will return the wrong answer rather than alerting the developer, so
we probabaly want "Assert(false)" here.

Other than that, the pack/unpack functions could use some
documentation about which parameter is low/high.

--
John Naylor
Amazon Web Services

Attachment Content-Type Size
interleave-mask.patch.txt text/plain 2.7 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Richard Guo 2025-09-29 08:52:18 Re: plan shape work
Previous Message Chao Li 2025-09-29 08:19:48 Re: GB18030-2022 Support in PostgreSQL