Re: speed up unicode normalization quick check

From: John Naylor <john(dot)naylor(at)2ndquadrant(dot)com>
To: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: speed up unicode normalization quick check
Date: 2020-05-29 03:54:39
Message-ID: CACPNZCvUBmKSivCGAjh-sERQ9bAigB14cPZr5Zc4Do_ryd5Ezg@mail.gmail.com
Lists: pgsql-hackers

On Fri, May 29, 2020 at 5:59 AM Mark Dilger
<mark(dot)dilger(at)enterprisedb(dot)com> wrote:
>
> > On May 21, 2020, at 12:12 AM, John Naylor <john(dot)naylor(at)2ndquadrant(dot)com> wrote:

> > very picky in general. As a test, it also successfully finds a
> > function for the OS "words" file, the "D" sets of codepoints, and for
> > sets of the first n built-in OIDs, where n > 5.
>
> Prior to this patch, src/tools/gen_keywordlist.pl is the only script that uses PerfectHash. Your patch adds a second. I'm not convinced that modifying the PerfectHash code directly each time a new caller needs different multipliers is the right way to go.

Calling it "each time" with a sample size of two is a bit of a
stretch. The first implementation made a reasonable attempt to suit
future uses, and I simply made it a bit more robust. In the text quoted
above, you can see I tested some scenarios beyond the current use
cases, with key set sizes as low as 6 and as high as 250k.

> Could you instead make them arguments such that gen_keywordlist.pl, generate-unicode_combining_table.pl, and future callers can pass in the numbers they want? Or is there some advantage to having it this way?

That is an implementation detail that callers have no business knowing about.

--
John Naylor https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
