Re: Speed up JSON escape processing with SIMD plus other optimisations

From: John Naylor <johncnaylorls(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Speed up JSON escape processing with SIMD plus other optimisations
Date: 2025-05-27 23:23:47
Message-ID: CANWCAZZkngp7o-rcOsDLyNYv=3h82qO_H2HDUWinCGasTacxkw@mail.gmail.com
Lists: pgsql-hackers

On Thu, May 23, 2024 at 8:24 AM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> Other things I considered were if doing 16 bytes at a time is too much
> as it puts quite a bit of work into byte-at-a-time processing if just
> 1 special char exists in a 16-byte chunk. I considered doing SWAR [1]
> processing to do the job of vector8_has_le() and vector8_has() byte
> maybe with just uint32s. It might be worth doing that. However, I've
> not done it yet as it raises the bar for this patch quite a bit. SWAR
> vector processing is pretty much write-only code. Imagine trying to
> write comments for the code in [2] so that the average person could
> understand what's going on!?

Sorry to resurrect this thread, but I recently saw something that made
me think of this commit (as well as the similar one 0a8de93a48c):

https://lemire.me/blog/2025/04/13/detect-control-characters-quotes-and-backslashes-efficiently-using-swar/
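
For anyone who hasn't seen the trick, here is a minimal sketch in C of the kind of check that post describes: testing eight bytes at once for control characters (< 0x20), double quotes, and backslashes. The function names are hypothetical and this is not the code in the attached patch; it only illustrates the technique under those assumptions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (name is illustrative only): returns true if any of
 * the 8 bytes packed in x is < 0x20, '"' (0x22) or '\\' (0x5C).
 */
static inline bool
swar_has_json_escapable(uint64_t x)
{
    /* High bit of each byte is set where the input byte is ASCII (< 0x80). */
    uint64_t is_ascii = UINT64_C(0x8080808080808080) & ~x;

    /*
     * Subtracting 0x20 sets a byte's high bit when it is < 0x20. Bytes
     * >= 0xA0 also end up with the high bit set, but is_ascii masks them off.
     */
    uint64_t lt32 = x - UINT64_C(0x2020202020202020);

    /*
     * XOR turns a matching byte into zero; subtracting 1 then sets its high
     * bit. In all three subtractions a borrow can only start at a byte that
     * is itself a match, so borrows cannot make the whole-word answer wrong.
     */
    uint64_t eq34 = (x ^ UINT64_C(0x2222222222222222)) - UINT64_C(0x0101010101010101);
    uint64_t eq92 = (x ^ UINT64_C(0x5C5C5C5C5C5C5C5C)) - UINT64_C(0x0101010101010101);

    return ((lt32 | eq34 | eq92) & is_ascii) != 0;
}

/*
 * Hypothetical caller: does any byte of s[0..len) need escaping?  Scans
 * 8 bytes at a time, then finishes the tail one byte at a time.
 */
static bool
needs_json_escape(const char *s, size_t len)
{
    size_t i = 0;

    for (; i + sizeof(uint64_t) <= len; i += sizeof(uint64_t))
    {
        uint64_t chunk;

        /* memcpy avoids unaligned-access and aliasing problems. */
        memcpy(&chunk, s + i, sizeof(chunk));
        if (swar_has_json_escapable(chunk))
            return true;
    }

    for (; i < len; i++)
    {
        unsigned char c = (unsigned char) s[i];

        if (c < 0x20 || c == '"' || c == '\\')
            return true;
    }

    return false;
}
```

Because the check only asks whether any byte in the word matches, the result does not depend on byte order, so the same code should behave identically on little- and big-endian machines.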

I don't find this use of SWAR that bad for readability, and there's
only one obscure, clever part that merits a comment. Plus, it seems JSON
escapes are pretty much set in stone? I gave this a spin with

https://www.postgresql.org/message-id/attachment/163406/json_bench.sh.txt

master:

Test 1
tps = 321.522667 (without initial connection time)
tps = 315.070985 (without initial connection time)
tps = 331.070054 (without initial connection time)
Test 2
tps = 35.107257 (without initial connection time)
tps = 34.977670 (without initial connection time)
tps = 35.898471 (without initial connection time)
Test 3
tps = 33.575570 (without initial connection time)
tps = 32.383352 (without initial connection time)
tps = 31.876192 (without initial connection time)
Test 4
tps = 810.676116 (without initial connection time)
tps = 745.948518 (without initial connection time)
tps = 747.651923 (without initial connection time)

swar patch:

Test 1
tps = 291.919004 (without initial connection time)
tps = 294.446640 (without initial connection time)
tps = 307.670464 (without initial connection time)
Test 2
tps = 30.984440 (without initial connection time)
tps = 31.660630 (without initial connection time)
tps = 32.538174 (without initial connection time)
Test 3
tps = 29.828546 (without initial connection time)
tps = 30.332913 (without initial connection time)
tps = 28.873059 (without initial connection time)
Test 4
tps = 748.676688 (without initial connection time)
tps = 768.798734 (without initial connection time)
tps = 766.924632 (without initial connection time)

While noisy, this test seems a bit faster with SWAR, and it's more
portable to boot. I'm not sure where I'd put the new function so both
call sites can see it, but that's a small detail...

--
John Naylor
Amazon Web Services

Attachment: v1-swar-json.patch (text/x-patch, 3.9 KB)
