pgsql: Use perfect hashing, instead of binary search, for keyword looku

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-committers(at)lists(dot)postgresql(dot)org
Subject: pgsql: Use perfect hashing, instead of binary search, for keyword looku
Date: 2019-01-10 00:48:01
Message-ID: E1ghOVt-0007os-2V@gemulon.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers pgsql-hackers

Use perfect hashing, instead of binary search, for keyword lookup.

We've been speculating for a long time that hash-based keyword lookup
ought to be faster than binary search, but up to now we hadn't found
a suitable tool for generating the hash function. Joerg Sonnenberger
provided the inspiration, and sample code, to show us that rolling our
own generator wasn't a ridiculous idea. Hence, do that.

The method used here requires a lookup table of approximately 4 bytes
per keyword, but that's less than what we saved in the predecessor commit
afb0d0712, so it's not a big problem. The time savings is indeed
significant: preliminary testing suggests that the total time for raw
parsing (flex + bison phases) drops by ~20%.

Patch by me, but it owes its existence to Joerg Sonnenberger;
thanks also to John Naylor for review.

Discussion: https://postgr.es/m/20190103163340.GA15803@britannica.bec.de

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/c64d0cd5ce24a344798534f1bc5827a9199b7a6e

Modified Files
--------------
src/common/Makefile | 9 +-
src/common/kwlookup.c | 73 +++---
src/include/common/kwlookup.h | 4 +
src/include/parser/kwlist.h | 3 +-
src/interfaces/ecpg/preproc/Makefile | 13 +-
src/interfaces/ecpg/preproc/c_keywords.c | 51 ++--
src/interfaces/ecpg/preproc/c_kwlist.h | 3 +-
src/interfaces/ecpg/preproc/ecpg_kwlist.h | 3 +-
src/pl/plpgsql/src/Makefile | 13 +-
src/pl/plpgsql/src/pl_reserved_kwlist.h | 5 +-
src/pl/plpgsql/src/pl_unreserved_kwlist.h | 7 +-
src/tools/PerfectHash.pm | 376 ++++++++++++++++++++++++++++++
src/tools/gen_keywordlist.pl | 53 ++++-
src/tools/msvc/Solution.pm | 10 +-
14 files changed, 516 insertions(+), 107 deletions(-)

Responses

Browse pgsql-committers by date

  From Date Subject
Next Message Robert Haas 2019-01-10 02:11:20 Re: pgsql: Use perfect hashing, instead of binary search, for keyword looku
Previous Message Michael Paquier 2019-01-10 00:39:34 pgsql: Fix grammar mistakes in md.c

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2019-01-10 01:01:40 Re: Misleading panic message in backend/access/transam/xlog.c
Previous Message Michael Paquier 2019-01-10 00:47:52 Re: Making WAL receiver startup rely on GUC context for primary_conninfo and primary_slot_name