Re: speed up verifying UTF-8

From: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: speed up verifying UTF-8
Date: 2021-06-11 00:36:14
Message-ID: CAFBsxsGybY_LUQPzzGX6uZv=V=bFgxAtUhHqpU71qm+h7i218w@mail.gmail.com
Lists: pgsql-hackers

I wrote:

> Also, if SSE is accepted into the tree, then the C fallback is only
> important on platforms like PowerPC64 and Arm64, so we can make the
> tradeoff by testing those more carefully. I'll test on PowerPC soon.

I got around to testing on POWER8 / Linux / gcc 4.8.5 and found a
regression in the mixed2 case in v11. v12 fixes that regression, at the
cost of giving up some of the improvement in the ascii case (about 5x
vs. 8x faster than master).

master:
 chinese | mixed | ascii | mixed2
---------+-------+-------+--------
    2966 |  1525 |   871 |   1474

v11-0001:
 chinese | mixed | ascii | mixed2
---------+-------+-------+--------
    1030 |   644 |   102 |   1760

v12-0001:
 chinese | mixed | ascii | mixed2
---------+-------+-------+--------
     977 |   632 |   168 |   1113
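
For reference, the ratios behind the "5x vs. 8x" remark can be recomputed
from the three tables. This is only a small sketch: it assumes the numbers
are elapsed times where lower is better (the message does not state the
unit), and the names `timings` and `speedup` are made up for illustration.

```python
# Timings copied from the three tables in this message. The unit is not
# stated there; this sketch assumes elapsed time, so lower is better.
timings = {
    "master":   {"chinese": 2966, "mixed": 1525, "ascii": 871, "mixed2": 1474},
    "v11-0001": {"chinese": 1030, "mixed": 644,  "ascii": 102, "mixed2": 1760},
    "v12-0001": {"chinese": 977,  "mixed": 632,  "ascii": 168, "mixed2": 1113},
}


def speedup(patch, case):
    """Return master time divided by patched time (> 1 means faster)."""
    return timings["master"][case] / timings[patch][case]


for patch in ("v11-0001", "v12-0001"):
    row = ", ".join(
        f"{case}: {speedup(patch, case):.2f}x" for case in timings["master"]
    )
    print(f"{patch}: {row}")
```

With these inputs, ascii comes out at roughly 8.5x for v11 and 5.2x for
v12, while mixed2 is about 0.84x for v11 (the regression) and 1.32x for v12.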

--
John Naylor
EDB: http://www.enterprisedb.com
