Re: Upgrading our minimum required flex version for 8.5

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Upgrading our minimum required flex version for 8.5
Date: 2009-07-12 05:13:38
Message-ID: 7484.1247375618@sss.pgh.pa.us
Lists: pgsql-hackers

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> Tom Lane wrote:
>> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>>> I think it would need to be benchmarked. My faint recollection is that
>>> the re-entrant lexers are slower.
>>
>> The flex documentation states in so many words:
>> The option `--reentrant' does not affect the performance of the scanner.
>> Do you feel a need to verify their claim?

> No, I'll take their word for it. I must have been thinking of something
> else.

As I got further into this, it turned out that Andrew's instinct was
right: it does need to be benchmarked. Although the inner loops of the
lexer seem to be the same with or without --reentrant, once you buy into
the whole nine yards of --reentrant, --bison-bridge, and a "pure" bison
parser, you find out that the lexer's API changes: there are more
parameters to yylex() than there used to be. It's also necessary to
pass around a yyscanner pointer to all the subroutines in scan.l. (But
on the other hand, this eliminates accesses to global variables, which
are often not that cheap.) So the "no performance impact" claim isn't
telling the whole truth.
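
To illustrate the API change, here is a minimal hand-written sketch in C. It
is not flex output and not the code in scan.l: ScanState, scan_one,
old_scanner_init, old_yylex, new_yylex, and the TOK_* codes are all
hypothetical names invented for this example. The old style keeps scanner
state, yylval, and yylloc in globals; the new style takes them as parameters,
roughly mirroring the yylex(YYSTYPE *, YYLTYPE *, yyscan_t) shape that flex
generates with --reentrant, --bison-bridge, and --bison-locations.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token codes for this sketch (not PostgreSQL's). */
enum { TOK_EOF = 0, TOK_IDENT = 1, TOK_NUMBER = 2, TOK_OTHER = 3 };

/* Hypothetical scanner state: input buffer plus current offset. */
typedef struct ScanState
{
    const char *buf;
    size_t      pos;
} ScanState;

/*
 * Shared scanning logic.  Stores the token length in *val and the token's
 * start offset in *loc, and returns the token code.
 */
static int
scan_one(ScanState *s, int *val, int *loc)
{
    const char *p = s->buf + s->pos;
    const char *start;
    int         tok;

    while (*p != '\0' && isspace((unsigned char) *p))
        p++;
    start = p;

    if (*p == '\0')
        tok = TOK_EOF;
    else if (isalpha((unsigned char) *p) || *p == '_')
    {
        while (isalnum((unsigned char) *p) || *p == '_')
            p++;
        tok = TOK_IDENT;
    }
    else if (isdigit((unsigned char) *p))
    {
        while (isdigit((unsigned char) *p))
            p++;
        tok = TOK_NUMBER;
    }
    else
    {
        p++;                    /* any other single character */
        tok = TOK_OTHER;
    }

    *val = (int) (p - start);
    *loc = (int) (start - s->buf);
    s->pos = (size_t) (p - s->buf);
    return tok;
}

/* ---- Non-reentrant style: everything lives in globals ---- */
static ScanState global_state;
static int  yylval_global;      /* stands in for the global yylval */
static int  yylloc_global;      /* stands in for the global yylloc */

static void
old_scanner_init(const char *str)
{
    global_state.buf = str;
    global_state.pos = 0;
}

static int
old_yylex(void)
{
    return scan_one(&global_state, &yylval_global, &yylloc_global);
}

/* ---- Reentrant style: state and results are passed explicitly ---- */
static int
new_yylex(int *yylval_param, int *yylloc_param, ScanState *yyscanner)
{
    return scan_one(yyscanner, yylval_param, yylloc_param);
}
```

The scanning loop is identical in both styles; only the calling convention
differs. That is the sense in which the inner loops are unaffected while the
per-token call picks up extra parameters.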

As best I can tell after some casual testing on a couple of machines,
the actual bottom line is that "raw_parser" (i.e., the bison and flex
processing) is going to be a couple of percent slower with a reentrant
grammar and lexer, for typical queries involving a lot of short tokens.
Now this disappears into the noise as soon as you include parse analysis
(let alone planning and execution), but it is possible to measure the
slowdown in a test harness that calls raw_parser only.
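
A harness of that kind could look roughly like the following C sketch. This
is not the harness used for the numbers above, which is not shown in this
message. The names parse_fn, time_parser, fake_raw_parser, and sink are
hypothetical, and fake_raw_parser is only a stand-in so the example is
self-contained; a real measurement would call the actual raw_parser from
inside a backend environment.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical signature for "something that parses one query string". */
typedef void (*parse_fn) (const char *query);

/*
 * Run fn over every query, `loops` times, and return elapsed CPU seconds.
 * Only the parse step is inside the timed region.
 */
static double
time_parser(parse_fn fn, const char *const *queries, size_t nqueries,
            int loops)
{
    clock_t     start = clock();

    for (int i = 0; i < loops; i++)
        for (size_t q = 0; q < nqueries; q++)
            fn(queries[q]);

    return (double) (clock() - start) / CLOCKS_PER_SEC;
}

/* Accumulator that keeps the compiler from discarding the stand-in's work. */
static volatile unsigned long sink;

/* Stand-in for raw_parser (assumption): just counts spaces in the query. */
static void
fake_raw_parser(const char *query)
{
    unsigned long n = 0;

    for (const char *p = query; *p != '\0'; p++)
        if (*p == ' ')
            n++;
    sink += n;
}
```

To compare two builds, one would run the same query set with a large loop
count against each and look at the ratio of the elapsed times, repeating the
runs since differences of a couple of percent are easily lost in noise.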

A possible compromise that I think would avoid most or all of the
slowdown is to make the lexer reentrant but not the grammar (so that
yylval and yylloc remain as global variables instead of being parameters
to yylex). I haven't actually benchmarked that, though, and the split
strikes me as a fairly silly thing to do. If we're going to go for
reentrancy, I think we should fix both components.

I'm willing to live with the small slowdown. Comments?

regards, tom lane
