Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)

From: John Naylor <jcnaylor(at)gmail(dot)com>
To: Joerg Sonnenberger <joerg(at)bec(dot)de>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Date: 2019-01-04 20:20:07
Message-ID: CAJVSVGVfe+-+NT0NY0XUtrb_h-r_0-54dyiy0SJfFLdYEZgK+A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 1/3/19, Joerg Sonnenberger <joerg(at)bec(dot)de> wrote:
> Hello John,
> I was pointed at your patch on IRC and decided to look into adding my
> own pieces. What I can provide you is a fast perfect hash function
> generator. I've attached a sample hash function based on the current
> main keyword list. hash() essentially gives you the number of the only
> possible match, a final strcmp/memcmp is still necessary to verify that
> it is an actual keyword though. The |0x20 can be dropped if all cases
> have pre-lower-cased the input already. This would replace the binary
> search in the lookup functions. Returning offsets directly would be easy
> as well. That allows writing a single string where each entry is prefixed
> with a type mask, the token id, the length of the keyword and the actual
> keyword text. Does that sound useful to you?

Judging by previous responses, there is still interest in using
perfect hash functions, so thanks for this. I'm not knowledgeable
enough to judge its implementation, so I'll leave that for others.

-John Naylor

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2019-01-04 20:29:40 Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Previous Message Tom Lane 2019-01-04 20:14:47 Re: Arrays of domain returned to client as non-builtin oid describing the array, not the base array type's oid