Re: speed up verifying UTF-8

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
Cc: Greg Stark <stark(at)mit(dot)edu>, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: speed up verifying UTF-8
Date: 2021-06-03 19:16:04
Message-ID: bca46396-a517-467c-72f8-6140a05a4d1e@iki.fi
Lists: pgsql-hackers

On 03/06/2021 22:10, John Naylor wrote:
> On Thu, Jun 3, 2021 at 3:08 PM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
> >                 x1 = half1 + UINT64CONST(0x7f7f7f7f7f7f7f7f);
> >                 x2 = half2 + UINT64CONST(0x7f7f7f7f7f7f7f7f);
> >
> >                 /* then check that the high bit is set in each byte. */
> >                 x = (x1 | x2);
> >                 x &= UINT64CONST(0x8080808080808080);
> >                 if (x != UINT64CONST(0x8080808080808080))
> >                         return 0;
>
> That seems right, I'll try that and update the patch. (Forgot to attach
> earlier anyway)

Ugh, actually that has the same issue as before. If a byte is zero in
one half, but the byte at the same position in the other half is not,
this fails to detect it: the OR of x1 and x2 still has the high bit set
at that position. Sorry for the noise.

- Heikki
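
[Editor's sketch, not part of the original message] The failure described above can be reproduced in isolation. The function names below are hypothetical, and both functions assume every input byte already has its high bit clear (i.e. the ASCII check happens elsewhere), so adding 0x7f never carries into the neighbouring byte. The AND variant is only one possible way to avoid the problem; the message itself does not propose a fix.

```c
#include <stdint.h>

#define ADD_7F    UINT64_C(0x7f7f7f7f7f7f7f7f)
#define HIGH_BITS UINT64_C(0x8080808080808080)

/*
 * The check as quoted: add 0x7f to every byte of each half, OR the two
 * sums, and require the high bit in every byte. A zero byte yields 0x7f
 * (high bit clear), but the OR hides it whenever the byte at the same
 * position in the other half is non-zero.
 * Returns 1 if no zero byte was detected, 0 otherwise.
 */
static int
no_zero_bytes_or(uint64_t half1, uint64_t half2)
{
	uint64_t	x = (half1 + ADD_7F) | (half2 + ADD_7F);

	return (x & HIGH_BITS) == HIGH_BITS;
}

/*
 * Hypothetical variant (an assumption, not from the thread): AND the two
 * sums instead, so a cleared high bit in either half survives.
 * Returns 1 if no zero byte was detected, 0 otherwise.
 */
static int
no_zero_bytes_and(uint64_t half1, uint64_t half2)
{
	uint64_t	x = (half1 + ADD_7F) & (half2 + ADD_7F);

	return (x & HIGH_BITS) == HIGH_BITS;
}
```

With half1 = 0x6161616100616161 (one zero byte) and half2 = 0x6161616161616161 (all 'a'), no_zero_bytes_or() reports that there is no zero byte, while no_zero_bytes_and() reports that there is one.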
