Re: Bison state table

From: John Naylor <john(dot)naylor(at)2ndquadrant(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, akim(at)lrde(dot)epita(dot)fr
Subject: Re: Bison state table
Date: 2019-08-13 10:09:30
Message-ID: CACPNZCvzAjiK-CP-qcwmqTrfwM039AGDw2Bq3_aboRot0m1NGA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Jan 26, 2019 at 5:39 AM Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>
> With our scanner keywords list now more cache-aware, and with us
> planning to use Bison for years to come, has anyone ever looked at
> reordering the bison state machine array to be more cache aware, e.g.,
> having common states next to each other rather than scattered around the
> array?

I recently spent some time investigating how the various arrays are
generated and accessed, and found this informative article:

https://www.cs.uic.edu/~spopuri/cparser.html

It's dated from 2006, but still seems to be correct on the whole,
judging by the gram.c output file. Anyway, the short answer is,
grouping common states would require changing Bison's algorithm for
compressing a sparse 2-D array into multiple less-sparse 1-D arrays,
if it's even possible to control the effect you have in mind.

That said, I had an idea. When Bison consults yytable, it has to also
consult yycheck with the same index to make sure the result is valid.
If the two arrays of int16 were merged into a single array of structs,
the state and the check would be on the same cache line. I tried this
by directly patching the gram.c output file (see attached for the
basic idea) and #include'-ing a separate file with the merged array.
It didn't improve raw parsing microbenchmarks using information schema
or short pgbench-like queries. So I'm guessing either a). the
microbenchmark already has better cache behavior than real queries so
won't show much difference here, and/or b). the bottleneck is
elsewhere than accessing yytable and yycheck.

In any case, I hope to follow Bison development and test any
performance improvements the maintainers come up with, as that was
reported to be on the roadmap:

https://www.postgresql.org/message-id/74DD0F55-F3CF-447E-ACF2-88C01E42AAC3@lrde.epita.fr

--
John Naylor https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment Content-Type Size
gram-combine.patch application/octet-stream 1.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Graham Leggett 2019-08-13 10:21:37 Re: Feature: Use DNS SRV records for connecting
Previous Message Feike Steenbergen 2019-08-13 09:50:18 Feature: Use DNS SRV records for connecting