From: | John Naylor <john(dot)naylor(at)enterprisedb(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jelte Fennema <Jelte(dot)Fennema(at)microsoft(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net> |
Subject: | Re: [PATCH] Optimize json_lex_string by batching character copying |
Date: | 2022-08-15 13:33:21 |
Message-ID: | CAFBsxsESLUyJ5spfOSyPrOvKUEYYNqsBosue9SV1j8ecgNXSKA@mail.gmail.com |
Lists: | pgsql-hackers |
I wrote:
> On Mon, Jul 11, 2022 at 11:07 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> > I wonder if we can add a somewhat more general function for scanning until
> > some characters are found using SIMD? There's plenty other places that could
> > be useful.
>
> In simple cases, we could possibly abstract the entire loop. With this particular case, I imagine the most approachable way to write the loop would be a bit more low-level:
>
> while (p < end - VECTOR_WIDTH &&
> !vector_has_byte(p, '\\') &&
> !vector_has_byte(p, '"') &&
> vector_min_byte(p, 0x20))
> p += VECTOR_WIDTH;
>
> I wonder if we'd lose a bit of efficiency here by not accumulating set bits from the three conditions, but it's worth trying.
The attached implements the above, more or less, using new pg_lfind8()
and pg_lfind8_le(), which in turn are based on helper functions that
act on a single vector. The pg_lfind* functions have regression tests,
but I haven't done the same for json yet. I went the extra step to use
bit-twiddling for non-SSE builds using uint64 as a "vector", which
still gives a pretty good boost (test below, min of 3):
master: 356ms
v5: 259ms
v5, SSE disabled: 288ms
It still needs a bit of polishing and testing, but I think it's a good
workout for abstracting SIMD out of the way.
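
For anyone reading without the patch open, here is a rough sketch of the uint64-as-"vector" idea. It is not the code in the attached patch: the names broadcast8(), word_has_byte(), word_has_byte_le() and scan_plain_json_chars() are made up for illustration, and the bit tricks are the well-known "word has a zero byte" and "word has a byte less than n" hacks. The less-than trick is only valid for small constants, which is assumed (and asserted) here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Replicate one byte into all eight lanes of a 64-bit word. */
static inline uint64_t
broadcast8(uint8_t c)
{
    return UINT64_C(0x0101010101010101) * c;
}

/* True if any byte of v is zero. */
static inline bool
word_has_zero(uint64_t v)
{
    return ((v - broadcast8(0x01)) & ~v & broadcast8(0x80)) != 0;
}

/* True if any byte of v equals c: XOR turns matching bytes into zero. */
static inline bool
word_has_byte(uint64_t v, uint8_t c)
{
    return word_has_zero(v ^ broadcast8(c));
}

/* True if any byte of v is <= c. The trick requires c + 1 <= 128. */
static inline bool
word_has_byte_le(uint64_t v, uint8_t c)
{
    assert(c < 0x80);
    return ((v - broadcast8((uint8_t) (c + 1))) & ~v & broadcast8(0x80)) != 0;
}

/*
 * Hypothetical helper: return the offset of the first byte in s[0..len)
 * that is a backslash, a double quote, or a control character (<= 0x1F),
 * or len if there is none.
 */
static size_t
scan_plain_json_chars(const char *s, size_t len)
{
    size_t      i = 0;

    /* Skip eight bytes at a time while the whole chunk is "plain". */
    while (len - i >= sizeof(uint64_t))
    {
        uint64_t    chunk;

        memcpy(&chunk, s + i, sizeof(chunk));   /* avoids unaligned loads */
        if (word_has_byte(chunk, '\\') ||
            word_has_byte(chunk, '"') ||
            word_has_byte_le(chunk, 0x1F))
            break;
        i += sizeof(chunk);
    }

    /* Finish (or pinpoint the special byte) one byte at a time. */
    while (i < len && s[i] != '\\' && s[i] != '"' &&
           (unsigned char) s[i] > 0x1F)
        i++;

    return i;
}
```

The word-level checks only answer "is there such a byte somewhere in this chunk", so byte order does not matter, and the byte-at-a-time tail loop locates the exact position. An SSE2 build would swap the three word helpers for vector compares while keeping the same loop shape.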
-------------
test:
DROP TABLE IF EXISTS long_json_as_text;
CREATE TABLE long_json_as_text AS
with long as (
select repeat(description, 11)
from pg_description
)
select (select json_agg(row_to_json(long))::text as t from long) from
generate_series(1, 100);
VACUUM FREEZE long_json_as_text;
select 1 from long_json_as_text where t::json is null; -- from Andrew upthread
--
John Naylor
EDB: http://www.enterprisedb.com
Attachment | Content-Type | Size |
---|---|---|
v5-json-lex-string-simd-ops.patch | text/x-patch | 11.4 KB |