Re: use SSE2 for is_valid_ascii

From: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Andres Freund <andres(at)anarazel(dot)de>, Jelte Fennema <me(at)jeltef(dot)nl>
Subject: Re: use SSE2 for is_valid_ascii
Date: 2022-08-11 04:10:34
Message-ID: CAFBsxsFXym2h5LZiHUCP=WQzvDMeSgp1+A3UoBb0jtDdHTsWtQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Aug 11, 2022 at 5:31 AM Nathan Bossart <nathandbossart(at)gmail(dot)com> wrote:
>
> This is a neat patch. I don't know that we need an entirely separate code
> block for the USE_SSE2 path, but I do think that a little bit of extra
> commentary would improve the readability. IMO the existing comment for the
> zero accumulator has the right amount of detail.
>
> +    /*
> +     * Set all bits in each lane of the error accumulator where input
> +     * bytes are zero.
> +     */
> +    error_cum = _mm_or_si128(error_cum,
> +                             _mm_cmpeq_epi8(chunk, _mm_setzero_si128()));

Okay, I will think about the comments, thanks for looking.

> I wonder if reusing a zero vector (instead of creating a new one every
> time) has any noticeable effect on performance.

Creating a zeroed register is just PXOR FOO, FOO. The compiler should
hoist that out of the loop (which is unrolled in this case), and a
recent CPU recognizes the idiom and just maps the register to a
hard-coded zero in the register file, in which case the execution
latency is 0 cycles. :-)
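
To make the comparison concrete, here is a minimal sketch of the two
spellings, not the patch itself. The function names are hypothetical, and
the sketch assumes an x86 target with SSE2 and a length that is a multiple
of 16. One version asks for a zero vector inside the loop and the other
reuses one created up front. Both compute the same result, and I would
expect an optimizing compiler to emit essentially the same code for both,
though that is worth checking in the generated assembly rather than taking
on faith.

```c
#include <emmintrin.h>			/* SSE2 intrinsics */
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns true if any byte in buf[0..len) is zero.
 * Assumes len is a multiple of 16. The zero vector is requested on every
 * iteration, as in the quoted hunk.
 */
static bool
has_zero_byte_inline_zero(const unsigned char *buf, size_t len)
{
	__m128i		error_cum = _mm_setzero_si128();

	for (size_t i = 0; i < len; i += 16)
	{
		__m128i		chunk = _mm_loadu_si128((const __m128i *) (buf + i));

		/* Set all bits in each lane of error_cum where the input byte is zero. */
		error_cum = _mm_or_si128(error_cum,
								 _mm_cmpeq_epi8(chunk, _mm_setzero_si128()));
	}

	/* Any set lane means at least one zero byte was seen. */
	return _mm_movemask_epi8(error_cum) != 0;
}

/*
 * Same check, but with the zero vector created once before the loop and
 * reused for every comparison.
 */
static bool
has_zero_byte_hoisted_zero(const unsigned char *buf, size_t len)
{
	const __m128i zero = _mm_setzero_si128();
	__m128i		error_cum = zero;

	for (size_t i = 0; i < len; i += 16)
	{
		__m128i		chunk = _mm_loadu_si128((const __m128i *) (buf + i));

		error_cum = _mm_or_si128(error_cum, _mm_cmpeq_epi8(chunk, zero));
	}

	return _mm_movemask_epi8(error_cum) != 0;
}
```

Building both with optimization enabled and comparing the loop bodies in
the assembly output is a quick way to see whether the zeroing instruction
was hoisted by the compiler.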

--
John Naylor
EDB: http://www.enterprisedb.com
