Re: Bison state table

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: John Naylor <john(dot)naylor(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, akim(at)lrde(dot)epita(dot)fr
Subject: Re: Bison state table
Date: 2019-08-13 16:36:14
Message-ID: 20190813163614.vxqbnjhophgipu62@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Aug 13, 2019 at 05:09:30PM +0700, John Naylor wrote:
> On Sat, Jan 26, 2019 at 5:39 AM Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> >
> > With our scanner keywords list now more cache-aware, and with us
> > planning to use Bison for years to come, has anyone ever looked at
> > reordering the bison state machine array to be more cache aware, e.g.,
> > having common states next to each other rather than scattered around the
> > array?
>
> I recently spent some time investigating how the various arrays are
> generated and accessed, and found this informative article:
>
> https://www.cs.uic.edu/~spopuri/cparser.html
>
> It's dated from 2006, but still seems to be correct on the whole,
> judging by the gram.c output file. Anyway, the short answer is,
> grouping common states would require changing Bison's algorithm for
> compressing a sparse 2-D array into multiple less-sparse 1-D arrays,
> if it's even possible to control the effect you have in mind.
>
> That said, I had an idea. When Bison consults yytable, it has to also
> consult yycheck with the same index to make sure the result is valid.
> If the two arrays of int16 were merged into a single array of structs,
> the state and the check would be on the same cache line. I tried this
> by directly patching the gram.c output file (see attached for the
> basic idea) and #include'-ing a separate file with the merged array.
> It didn't improve raw parsing microbenchmarks using information schema
> or short pgbench-like queries. So I'm guessing either a). the
> microbenchmark already has better cache behavior than real queries so
> won't show much difference here, and/or b). the bottleneck is
> elsewhere than accessing yytable and yycheck.
>
> In any case, I hope to follow Bison development and test any
> performance improvements the maintainers come up with, as that was
> reported to be on the roadmap:

Very interesting. Thanks for researching this.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2019-08-13 16:52:35 Re: pg_upgrade fails with non-standard ACL
Previous Message Jeff Davis 2019-08-13 16:25:53 Re: Add "password_protocol" connection parameter to libpq