Re: [POC] verifying UTF-8 using SIMD instructions

From:	John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
To:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [POC] verifying UTF-8 using SIMD instructions
Date:	2021-02-07 20:24:16
Message-ID:	CAFBsxsHWAy+GS39rEbsczLb-3H1=P_93urv-85K0R7dUQfajwQ@mail.gmail.com
Lists:	pgsql-hackers

Here is a more polished version of the function pointer approach, now
adapted to all multibyte encodings. Using the not-yet-committed tests from
[1], I found a thinko that caused the test for NUL bytes to be not only
wrong, but probably also elided by the compiler. Doing it correctly is
noticeably slower on pure ASCII, but still several times faster than
before, so the conclusions haven't changed. I'll run full measurements
later this week, but I'm sharing the patch now for review.

[1]
https://www.postgresql.org/message-id/11d39e63-b80a-5f8d-8043-fff04201fadc@iki.fi
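
For readers without the attachment at hand, here is a minimal sketch of what
an ASCII fast path with a NUL-byte check could look like. It is not taken
from the patch: the function names, the 8-byte chunk size, and the bit tricks
are assumptions chosen for illustration. The idea is to test a whole chunk at
once for any high bit (non-ASCII) and for any zero byte, and to fall back to
the regular byte-at-a-time verifier for whatever the fast path does not cover.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HIGH_BITS UINT64_C(0x8080808080808080)

/*
 * Hypothetical helper, not from the patch: returns true if the 8 bytes
 * at s are all ASCII (high bit clear) and none of them is a NUL byte.
 */
static bool
chunk_is_nonzero_ascii(const unsigned char *s)
{
	uint64_t	chunk;
	uint64_t	x;

	/* memcpy avoids unaligned-access and aliasing problems */
	memcpy(&chunk, s, sizeof(chunk));

	/* Any byte with the high bit set is not ASCII. */
	if (chunk & HIGH_BITS)
		return false;

	/*
	 * All bytes are now known to be <= 0x7f, so adding 0x7f to each byte
	 * cannot carry into its neighbor. The high bit of a byte ends up set
	 * exactly when that byte was nonzero.
	 */
	x = chunk + UINT64_C(0x7f7f7f7f7f7f7f7f);
	return (x & HIGH_BITS) == HIGH_BITS;
}

/*
 * Hypothetical driver: returns how many leading bytes of s were accepted
 * by the fast path (always a multiple of 8). The caller would verify the
 * remaining len - result bytes with the ordinary per-character routine.
 */
static size_t
ascii_prefix_len(const unsigned char *s, size_t len)
{
	size_t		i = 0;

	while (len - i >= 8 && chunk_is_nonzero_ascii(s + i))
		i += 8;

	return i;
}
```

Note that the zero-byte test depends on the high-bit test having passed
first; done in the other order, or on unchecked input, the addition could
carry between bytes and give a wrong answer.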

--
John Naylor
EDB: http://www.enterprisedb.com

Attachment	Content-Type	Size
v1-0001-Add-an-ASCII-fast-path-to-multibyte-encoding-veri.patch	application/octet-stream	7.6 KB

In response to

Re: [POC] verifying UTF-8 using SIMD instructions at 2021-02-04 21:48:35 from John Naylor

Responses

Re: [POC] verifying UTF-8 using SIMD instructions at 2021-02-08 10:17:11 from Heikki Linnakangas
