From: | John Naylor <jcnaylor(at)gmail(dot)com> |
---|---|
To: | Joerg Sonnenberger <joerg(at)bec(dot)de> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: reducing the footprint of ScanKeyword (was Re: Large writable variables) |
Date: | 2019-01-04 20:20:07 |
Message-ID: | CAJVSVGVfe+-+NT0NY0XUtrb_h-r_0-54dyiy0SJfFLdYEZgK+A@mail.gmail.com |
Lists: | pgsql-hackers |
On 1/3/19, Joerg Sonnenberger <joerg(at)bec(dot)de> wrote:
> Hello John,
> I was pointed at your patch on IRC and decided to look into adding my
> own pieces. What I can provide you is a fast perfect hash function
> generator. I've attached a sample hash function based on the current
> main keyword list. hash() essentially gives you the number of the only
> possible match, a final strcmp/memcmp is still necessary to verify that
> it is an actual keyword though. The |0x20 can be dropped if all cases
> have pre-lower-cased the input already. This would replace the binary
> search in the lookup functions. Returning offsets directly would be easy
> as well. That allows writing a single string where each entry is prefixed
> with a type mask, the token id, the length of the keyword and the actual
> keyword text. Does that sound useful to you?
Judging by previous responses, there is still interest in using
perfect hash functions, so thanks for this. I'm not knowledgeable
enough to judge its implementation, so I'll leave that for others.
-John Naylor
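
For readers without the attachment, here is a minimal sketch of the lookup scheme Joerg describes above. It is not his generated function. The four-keyword set, the category and token id values, and the names kw_blob, kw_offset, toy_hash and keyword_lookup are all made up for illustration. The hash here is a hand-picked toy that only happens to be collision-free for these four words; a real generator would emit something stronger for the full keyword list.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical packed keyword table: one string, each entry laid out as
 *   [category mask][token id][keyword length][keyword text]
 * The category and token values are placeholders, not PostgreSQL's.
 * Adjacent literals are concatenated after escapes are processed, so
 * "\x03" "and" is three bytes of header followed by the text "and".
 */
static const unsigned char kw_blob[] =
    "\x01\x0a\x05" "table"
    "\x01\x0b\x03" "and"
    "\x01\x0c\x04" "from"
    "\x01\x0d\x06" "select";

/* Offset of each entry in kw_blob, indexed by hash value. */
static const unsigned char kw_offset[4] = {0, 8, 14, 21};

/*
 * Toy perfect hash for exactly the four keywords above: the low two
 * bits of the case-folded first letter are 0 (t), 1 (a), 2 (f), 3 (s).
 * The |0x20 folds ASCII upper case to lower case and could be dropped
 * if callers always pass lower-cased input.
 */
static unsigned toy_hash(const char *s)
{
    return ((unsigned char) s[0] | 0x20) & 3;
}

/*
 * Returns the token id of the keyword, or -1 if word is not a keyword.
 * The hash only names the single possible candidate; the length check
 * and byte comparison are still needed to confirm an actual match.
 */
static int keyword_lookup(const char *word, size_t len)
{
    const unsigned char *entry;
    size_t i;

    if (len == 0)
        return -1;

    entry = kw_blob + kw_offset[toy_hash(word)];
    if (entry[2] != len)
        return -1;

    /* Keywords here contain only a-z, so folding with |0x20 is safe. */
    for (i = 0; i < len; i++)
    {
        if (((unsigned char) word[i] | 0x20) != entry[3 + i])
            return -1;
    }
    return entry[1];
}
```

With this table, keyword_lookup("SELECT", 6) and keyword_lookup("select", 6) both return the placeholder token id 13, while keyword_lookup("set", 3) hashes to the same slot but fails the length check and returns -1. Note that the |0x20 trick would not work unchanged for keywords containing an underscore, since '_' | 0x20 is not '_'.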