Re: Parser Cruft in gram.y

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kevin Grittner <kgrittn(at)mail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Parser Cruft in gram.y
Date: 2012-12-18 22:10:29
Message-ID: CA+TgmoYORJbk+bKs=n834u2bQncQ=gnrKNGtWYLpkePJrowekA@mail.gmail.com
Lists: pgsql-hackers

On Tue, Dec 18, 2012 at 4:33 AM, Dimitri Fontaine
<dimitri(at)2ndquadrant(dot)fr> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> And on the other hand, if you could get a clean split between the two
>> grammars, then regardless of exactly what the split was, it might seem
>> a win. But it seemed to me when I looked at this that you'd have to
>> duplicate a lot of stuff and the small parser still wouldn't end up
>> being very small, which I found hard to get excited about.
>
> I think the goal is not so much about getting a much smaller parser, but
> more about having a separate parser whose "bloat" you don't care
> about, so that you can improve DDL without fearing main parser
> performance regressions.

Well, that would be nice, but the problem is that I see no way to
implement it. If, with a unified parser, the parser is 14% of our
source code, then splitting it in two will probably crank that number
up well over 20%, because there will be duplication between the two.
That seems double-plus un-good.

I can't help but suspect that the way we handle keywords today is
monumentally inefficient. The unreserved_keyword productions, et al.,
just seem somehow badly wrong-headed. We take the trouble to
distinguish all of those cases so that we can turn around and not
distinguish them. I feel like there ought to be some way to use lexer
states to handle this - if we're in a context where an unreserved
keyword will be treated as an IDENT, then have the lexer return IDENT
when it sees an unreserved keyword. I might be wrong, but it seems
like that would eliminate a whole lot of parser state transitions.
However, even if I'm right, I have no idea how to implement it. It
just seems very wasteful that we have so many parser states that have
no purpose other than (effectively) to convert an unreserved_keyword
into an IDENT when the lexer could do the same thing much more cheaply
given a bit more context.
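
To make the idea concrete, here is a toy sketch in Python. It is not
PostgreSQL's scanner or grammar; the keyword sets, the classify() and
lex() names, and the ident_context flag are all made up for illustration.
It also assumes away the hard part, namely how the lexer would learn
that it is in a context where an unreserved keyword should be treated
as an IDENT. Here the caller simply passes that in.

```python
# Toy illustration only; not PostgreSQL's scanner (scan.l) or grammar (gram.y).
import re

# Hypothetical keyword sets, chosen just for this example.
RESERVED = {"select", "from", "where"}
UNRESERVED = {"begin", "commit", "zone"}

# Words only; a real lexer also handles literals, operators, quoting, etc.
WORD_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def classify(word, ident_context):
    """Return (token_type, text) for a single word.

    ident_context is an assumed flag meaning "the parser would accept an
    unreserved keyword as a plain identifier at this point".
    """
    lowered = word.lower()
    if lowered in RESERVED:
        # Reserved keywords are never identifiers, whatever the context.
        return ("KEYWORD", lowered)
    if lowered in UNRESERVED and not ident_context:
        # Only here does the keyword-ness of the word matter.
        return ("KEYWORD", lowered)
    # Ordinary identifiers, and unreserved keywords in an IDENT context,
    # come out as the same token, so the grammar never has to tell
    # them apart.
    return ("IDENT", lowered)


def lex(text, ident_context):
    """Tokenize text, using one context flag for the whole input."""
    return [classify(w, ident_context) for w in WORD_RE.findall(text)]
```

With ident_context set, lex("SELECT zone FROM t", True) hands the
parser "zone" as an IDENT, so the grammar would need no rule of its own
to fold that keyword back into an identifier. A single flag for the
whole input is of course a simplification; the context would really
have to change from token to token.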

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
