Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, John Naylor <jcnaylor(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Date: 2018-12-26 15:45:11
Message-ID: CA+TgmoamFb6tT7DgyatrsKOtN-XqBHaCELo4teKnfFm0FGx8ug@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Dec 19, 2018 at 8:01 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> The last time I looked into perfect hash functions, it wasn't easy to
> find a generator that competed with a decent normal hashtable (in
> particular gperf's are very unconvincing). The added tooling is a
> concern imo. OTOH, we're comparing not with a hashtable, but a binary
> search, where the latter will usually loose. Wonder if we shouldn't
> generate a serialized non-perfect hashtable instead. The lookup code for
> a read-only hashtable without concern for adversarial input is pretty
> trivial.

I wonder if we could do something really simple like a lookup based on
the first character of the scan keyword. It looks to me like there are
440 keywords right now, and the most common starting letter is 'c',
which is the first letter of 51 keywords. So dispatching based on the
first letter clips at least 3 steps off the binary search. I don't
know whether that's enough to be worthwhile, but it's probably pretty
simple to implement.

I'm not sure that I understand quite what you have in mind for a
serialized non-perfect hashtable. Are you thinking that we'd just
construct a simplehash and serialize it?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2018-12-26 16:22:39 Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Previous Message David Steele 2018-12-26 14:45:55 Re: Change pgarch_readyXlog() to return .history files first