Re: speed up unicode normalization quick check

From: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
To: John Naylor <john(dot)naylor(at)2ndquadrant(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: speed up unicode normalization quick check
Date: 2020-09-19 23:09:27
Message-ID: 74CB1F08-C551-4D9B-B37B-17CA39E58AF1@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Sep 19, 2020, at 3:58 PM, John Naylor <john(dot)naylor(at)2ndquadrant(dot)com> wrote:
>
> On Sat, Sep 19, 2020 at 1:46 PM Mark Dilger
> <mark(dot)dilger(at)enterprisedb(dot)com> wrote:
>
>> 0002 and 0003 look good to me. I like the way you cleaned up a bit with the unicode_norm_props struct, which makes the code a bit more tidy, and on my compiler under -O2 it does not generate any extra runtime dereferences, as the compiler can see through the struct just fine. My only concern would be if some other compilers might not see through the struct, resulting in a runtime performance cost? I wouldn't even ask, except that qc_hash_lookup is called in a fairly tight loop.
>
> (I assume you mean unicode_norm_info) Yeah, that usage was copied from
> the keyword list code. I believe it was done for the convenience of
> the callers. That is worth something, and so is consistency. That
> said, I'd be curious if there is a measurable impact for some
> platforms.

Right, unicode_norm_info. I'm not sure the convenience of the callers matters here, since the usage is restricted to just one file, but I also don't have a problem with the code as you have it.

>> In your commit message for 0001 you mentioned testing on a multiplicity of compilers. Did you do that for 0002 and 0003 as well?
>
> For that, I was simply using godbolt.org to test compiling the
> multiplications down to shift-and-adds. Very widespread, I only
> remember MSVC as not doing it. I'm not sure a few extra cycles would
> have been noticeable here, but it can't hurt to have that guarantee.

I am marking this ready for committer. I didn't object to the whitespace weirdness in your patch (about which `git apply` grumbles) since you seem to have done that intentionally. I have no further comments on the performance issue, since I don't have any other platforms at hand to test it on. Whichever committer picks this up can decide if the issue matters to them enough to punt it back for further performance testing.


Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nikita Glukhov 2020-09-19 23:23:21 Fix inconsistency in jsonpath .datetime()
Previous Message John Naylor 2020-09-19 22:58:11 Re: speed up unicode normalization quick check