Re: [POC] verifying UTF-8 using SIMD instructions

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [POC] verifying UTF-8 using SIMD instructions
Date: 2021-02-09 20:22:02
Message-ID: 8b5d6e4b-2478-38d1-8b3e-ce5132e3ce4c@iki.fi
Lists: pgsql-hackers

On 09/02/2021 22:08, John Naylor wrote:
> Maybe there's a smarter way to check for zeros in C. Or maybe be more
> careful about cache -- running memchr() on the whole input first might
> not be the best thing to do.

The usual trick is the haszero() macro here:
https://graphics.stanford.edu/~seander/bithacks.html#ZeroInWord. That's
how memchr() is typically implemented, too.
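
For illustration, here is a minimal sketch of that trick widened to 64-bit
words. The names haszero64() and contains_zero_byte() are hypothetical and
are not taken from the patch or from any libc; a real memchr() would also
deal with alignment and report the position of the match, which this sketch
does not attempt.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * 64-bit form of the haszero() bit hack: the result is nonzero if and only
 * if at least one of the eight bytes in v is 0x00.
 */
static inline uint64_t
haszero64(uint64_t v)
{
	return (v - UINT64_C(0x0101010101010101)) & ~v &
		UINT64_C(0x8080808080808080);
}

/*
 * Hypothetical helper: returns 1 if buf[0..len) contains a zero byte,
 * else 0. Checks eight bytes per iteration, then finishes the tail
 * one byte at a time.
 */
static int
contains_zero_byte(const unsigned char *buf, size_t len)
{
	while (len >= sizeof(uint64_t))
	{
		uint64_t	chunk;

		/* memcpy avoids unaligned-access and strict-aliasing problems */
		memcpy(&chunk, buf, sizeof(chunk));
		if (haszero64(chunk))
			return 1;
		buf += sizeof(chunk);
		len -= sizeof(chunk);
	}

	while (len > 0)
	{
		if (*buf == 0)
			return 1;
		buf++;
		len--;
	}
	return 0;
}
```

Because the word-level test only answers "is there a zero byte somewhere in
this chunk", it could presumably be folded into a loop that already walks the
input in chunks, rather than making a separate pass over the whole buffer.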

- Heikki
