Re: [POC] verifying UTF-8 using SIMD instructions

From: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [POC] verifying UTF-8 using SIMD instructions
Date: 2021-02-17 05:40:32
Message-ID: CAFBsxsGwaamEgiZGrr4YQPxdqjZSqNED+jxX3sHXSbjqSkD-0Q@mail.gmail.com
Lists: pgsql-hackers

I wrote:

> [v3]
> - It's not smart enough to stop at the last valid character boundary --
> it's either all-valid or it must start over with the fallback. That will
> have to change in order to work with the proposed noError conversions. It
> shouldn't be very hard, but needs thought as to the clearest and safest way
> to code it.

In v4, it should be able to return an accurate count of valid bytes even
when the end crosses a character boundary.
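
To illustrate what "an accurate count of valid bytes" means here, the following is a minimal scalar sketch. It is not code from the patch and does not use SIMD. The function name utf8_valid_prefix_len is hypothetical, and the sketch assumes plain RFC 3629 rules with no special treatment of NUL bytes. It returns the length of the longest prefix that is valid UTF-8 and ends on a character boundary, so a multibyte character cut off by the end of the input is not counted.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * Return the number of leading bytes of s[0..len) that form valid UTF-8
 * and end on a character boundary.  A character truncated by the end of
 * the input is not counted.
 */
static size_t
utf8_valid_prefix_len(const unsigned char *s, size_t len)
{
	size_t		pos = 0;

	assert(s != NULL || len == 0);

	while (pos < len)
	{
		unsigned char b = s[pos];
		unsigned char lo = 0x80;	/* allowed range of the second byte */
		unsigned char hi = 0xBF;
		size_t		need;			/* total bytes in this character */
		size_t		i;

		if (b < 0x80)
			need = 1;
		else if (b >= 0xC2 && b <= 0xDF)
			need = 2;
		else if (b >= 0xE0 && b <= 0xEF)
		{
			need = 3;
			if (b == 0xE0)
				lo = 0xA0;			/* reject overlong 3-byte forms */
			if (b == 0xED)
				hi = 0x9F;			/* reject UTF-16 surrogates */
		}
		else if (b >= 0xF0 && b <= 0xF4)
		{
			need = 4;
			if (b == 0xF0)
				lo = 0x90;			/* reject overlong 4-byte forms */
			if (b == 0xF4)
				hi = 0x8F;			/* reject code points above U+10FFFF */
		}
		else
			break;					/* stray continuation or invalid lead byte */

		if (len - pos < need)
			break;					/* character runs past the end of input */

		if (need > 1 && (s[pos + 1] < lo || s[pos + 1] > hi))
			break;

		/* any remaining bytes must be ordinary continuation bytes */
		for (i = 2; i < need; i++)
		{
			if (s[pos + i] < 0x80 || s[pos + i] > 0xBF)
				break;
		}
		if (i < need)
			break;

		pos += need;
	}

	return pos;
}
```

For example, given the two bytes 0x61 0xC3 (an "a" followed by the first byte of a two-byte character), this returns 1 rather than reporting the whole input as invalid.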

> - This is my first time hacking autoconf, and it still seems slightly
> broken, yet functional on my machine at least.

It was actually completely broken if you tried to pass the special flags to
configure. I redesigned this part and it seems to work now.

--
John Naylor
EDB: http://www.enterprisedb.com

Attachment: v4-SSE4-with-autoconf-support.patch (application/octet-stream, 48.5 KB)
