From: | John Naylor <johncnaylorls(at)gmail(dot)com> |
---|---|
To: | David Rowley <dgrowleyml(at)gmail(dot)com> |
Cc: | PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Speed up JSON escape processing with SIMD plus other optimisations |
Date: | 2025-05-27 23:23:47 |
Message-ID: | CANWCAZZkngp7o-rcOsDLyNYv=3h82qO_H2HDUWinCGasTacxkw@mail.gmail.com |
Lists: | pgsql-hackers |
On Thu, May 23, 2024 at 8:24 AM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> Other things I considered were if doing 16 bytes at a time is too much
> as it puts quite a bit of work into byte-at-a-time processing if just
> 1 special char exists in a 16-byte chunk. I considered doing SWAR [1]
> processing to do the job of vector8_has_le() and vector8_has() byte
> maybe with just uint32s. It might be worth doing that. However, I've
> not done it yet as it raises the bar for this patch quite a bit. SWAR
> vector processing is pretty much write-only code. Imagine trying to
> write comments for the code in [2] so that the average person could
> understand what's going on!?
Sorry to resurrect this thread, but I recently saw something that made
me think of this commit (as well as the similar one 0a8de93a48c):
I don't find this use of SWAR that bad for readability, and there's
only one obscure, clever part that merits a comment. Plus, it seems JSON
escapes are pretty much set in stone? I gave this a spin with
https://www.postgresql.org/message-id/attachment/163406/json_bench.sh.txt
master:
Test 1
tps = 321.522667 (without initial connection time)
tps = 315.070985 (without initial connection time)
tps = 331.070054 (without initial connection time)
Test 2
tps = 35.107257 (without initial connection time)
tps = 34.977670 (without initial connection time)
tps = 35.898471 (without initial connection time)
Test 3
tps = 33.575570 (without initial connection time)
tps = 32.383352 (without initial connection time)
tps = 31.876192 (without initial connection time)
Test 4
tps = 810.676116 (without initial connection time)
tps = 745.948518 (without initial connection time)
tps = 747.651923 (without initial connection time)
swar patch:
Test 1
tps = 291.919004 (without initial connection time)
tps = 294.446640 (without initial connection time)
tps = 307.670464 (without initial connection time)
Test 2
tps = 30.984440 (without initial connection time)
tps = 31.660630 (without initial connection time)
tps = 32.538174 (without initial connection time)
Test 3
tps = 29.828546 (without initial connection time)
tps = 30.332913 (without initial connection time)
tps = 28.873059 (without initial connection time)
Test 4
tps = 748.676688 (without initial connection time)
tps = 768.798734 (without initial connection time)
tps = 766.924632 (without initial connection time)
While noisy, this test seems a bit faster with SWAR, and it's more
portable to boot. I'm not sure where I'd put the new function so both
call sites can see it, but that's a small detail...
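
For readers who have not seen SWAR (SIMD within a register) before, here is a minimal sketch of the kind of check discussed above. It is not the code from the attached patch. It uses the well-known "has zero byte" and "has byte less than n" bit tricks on a uint64, and assumes the bytes that need attention are those below 0x20 plus '"' and '\'. All function and macro names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SWAR_ONES  UINT64_C(0x0101010101010101)	/* 0x01 in every byte */
#define SWAR_HIGHS UINT64_C(0x8080808080808080)	/* 0x80 in every byte */

/* True if any byte of v is zero. */
static inline bool
swar_has_zero(uint64_t v)
{
	/* A byte's high bit survives only if the byte was 0x00 (or borrowed
	 * from a lower zero byte, which is still a true "has zero"). */
	return ((v - SWAR_ONES) & ~v & SWAR_HIGHS) != 0;
}

/* True if any byte of v equals c: XOR turns matching bytes into zero. */
static inline bool
swar_has_byte(uint64_t v, unsigned char c)
{
	return swar_has_zero(v ^ (SWAR_ONES * c));
}

/* True if any byte of v is less than n. Only valid for n <= 128. */
static inline bool
swar_has_less(uint64_t v, unsigned char n)
{
	assert(n <= 128);
	return ((v - SWAR_ONES * n) & ~v & SWAR_HIGHS) != 0;
}

/* True if the 8 bytes at s contain a control char, '"' or '\\'. */
static bool
chunk_needs_json_escape(const char *s)
{
	uint64_t	v;

	memcpy(&v, s, sizeof(v));	/* avoids unaligned access */
	return swar_has_less(v, 0x20) ||
		swar_has_byte(v, '"') ||
		swar_has_byte(v, '\\');
}

/*
 * Hypothetical helper: length of the leading run of s that could be copied
 * verbatim. Skips 8 bytes at a time, then finishes byte by byte.
 */
static size_t
json_clean_prefix_len(const char *s, size_t len)
{
	size_t		i = 0;

	while (len - i >= 8 && !chunk_needs_json_escape(s + i))
		i += 8;

	for (; i < len; i++)
	{
		unsigned char c = (unsigned char) s[i];

		if (c < 0x20 || c == '"' || c == '\\')
			break;
	}
	return i;
}
```

The checks only answer "is there such a byte somewhere in this chunk", so the result does not depend on byte order. A caller would copy the clean prefix in one go and fall back to per-byte escaping from the returned offset.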
--
John Naylor
Amazon Web Services
Attachment | Content-Type | Size |
---|---|---|
v1-swar-json.patch | text/x-patch | 3.9 KB |