Re: [PATCH] Optimize json_lex_string by batching character copying

From: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jelte Fennema <Jelte(dot)Fennema(at)microsoft(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>
Subject: Re: [PATCH] Optimize json_lex_string by batching character copying
Date: 2022-07-12 06:57:48
Message-ID: CAFBsxsGzaaGLF=Nuq61iRXTyspbO9rOjhSqFN=V6ozzmta5mXg@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jul 11, 2022 at 11:07 PM Andres Freund <andres(at)anarazel(dot)de> wrote:

> I wonder if we can add a somewhat more general function for scanning until
> some characters are found using SIMD? There's plenty other places that could
> be useful.

In simple cases, we could possibly abstract the entire loop. With this
particular case, I imagine the most approachable way to write the loop
would be a bit more low-level:

    while (p < end - VECTOR_WIDTH &&
           !vector_has_byte(p, '\\') &&
           !vector_has_byte(p, '"') &&
           vector_min_byte(p, 0x20))
        p += VECTOR_WIDTH;

I wonder if we'd lose a bit of efficiency here by not accumulating set bits
from the three conditions, but it's worth trying.
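
To make the comparison concrete, here is a minimal, portable sketch of both
variants. It is not the patch itself. It assumes a "vector" is just a
uint64_t holding 8 bytes, so the byte tests are done with ordinary integer
arithmetic rather than real SIMD instructions. The helper bodies,
load_chunk(), less_than_mask(), equal_mask(), scalar_tail() and the two
skip_* functions are hypothetical names made up for illustration, and
vector_min_byte(p, 0x20) is taken to mean "every byte in the chunk is
>= 0x20".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: one "vector" is a uint64_t, i.e. 8 bytes per step. */
#define VECTOR_WIDTH sizeof(uint64_t)
#define LOW_BITS  UINT64_C(0x0101010101010101)
#define HIGH_BITS UINT64_C(0x8080808080808080)

/* Load 8 bytes without alignment or aliasing problems. */
static uint64_t load_chunk(const char *p)
{
    uint64_t v;

    memcpy(&v, p, sizeof(v));
    return v;
}

/* Nonzero iff some byte of v is less than n (only valid for n <= 128). */
static uint64_t less_than_mask(uint64_t v, unsigned char n)
{
    return (v - LOW_BITS * n) & ~v & HIGH_BITS;
}

/* Nonzero iff some byte of v equals c: XOR turns a match into a zero byte. */
static uint64_t equal_mask(uint64_t v, unsigned char c)
{
    return less_than_mask(v ^ (LOW_BITS * c), 1);
}

/* Hypothetical bodies for the helpers named in the loop above. */
static bool vector_has_byte(const char *p, unsigned char c)
{
    return equal_mask(load_chunk(p), c) != 0;
}

static bool vector_min_byte(const char *p, unsigned char min)
{
    return less_than_mask(load_chunk(p), min) == 0;
}

/* Byte-at-a-time scan that finds the exact stopping position. */
static const char *scalar_tail(const char *p, const char *end)
{
    while (p < end && *p != '\\' && *p != '"' && (unsigned char) *p >= 0x20)
        p++;
    return p;
}

/* Variant 1: three separate tests per chunk, short-circuited by &&. */
static const char *skip_short_circuit(const char *p, const char *end)
{
    assert(p <= end);
    while ((size_t) (end - p) >= VECTOR_WIDTH &&
           !vector_has_byte(p, '\\') &&
           !vector_has_byte(p, '"') &&
           vector_min_byte(p, 0x20))
        p += VECTOR_WIDTH;
    return scalar_tail(p, end);
}

/* Variant 2: OR the three results together and branch once per chunk. */
static const char *skip_accumulated(const char *p, const char *end)
{
    assert(p <= end);
    while ((size_t) (end - p) >= VECTOR_WIDTH)
    {
        uint64_t v = load_chunk(p);
        uint64_t hits = equal_mask(v, '\\') |
                        equal_mask(v, '"') |
                        less_than_mask(v, 0x20);

        if (hits != 0)
            break;
        p += VECTOR_WIDTH;
    }
    return scalar_tail(p, end);
}
```

Both functions return a pointer to the first byte that is a backslash, a
double quote, or below 0x20, or end if there is none. The sketch writes the
bound as end - p >= VECTOR_WIDTH so that inputs shorter than one chunk never
form a pointer before the start of the buffer. Whether the accumulated form
is actually faster would have to be measured.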
--
John Naylor
EDB: http://www.enterprisedb.com
