From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | pgsql-committers(at)postgresql(dot)org |
Subject: | pgsql: Improve parser's one-extra-token lookahead mechanism. |
Date: | 2015-02-24 22:54:04 |
Message-ID: | E1YQOMe-0001Od-QY@gemulon.postgresql.org |
Lists: | pgsql-committers |
Improve parser's one-extra-token lookahead mechanism.

There are a couple of places in our grammar that fail to be strict LALR(1),
by requiring more than a single token of lookahead to decide what to do.
Up to now we've dealt with that by using a filter between the lexer and
parser that merges adjacent tokens into one in the places where two tokens
of lookahead are necessary. But that creates a number of user-visible
anomalies, for instance that you can't name a CTE "ordinality" because
"WITH ordinality AS ..." triggers folding of WITH and ORDINALITY into one
token. I realized that there's a better way.

In this patch, we still do the lookahead basically as before, but we never
merge the second token into the first; we replace just the first token by
a special lookahead symbol when one of the lookahead pairs is seen.
This requires a couple of extra productions in the grammar, but it involves
fewer special tokens, so that the grammar tables come out a bit smaller
than before. The filter logic is no slower than before, perhaps a bit
faster.
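
The replace-only-the-first-token idea can be sketched as below. This is a minimal illustration, not the code in src/backend/parser/parser.c: the token codes, the struct, and the function names (raw_lex, filtered_lex) are all hypothetical, and the lexer is stubbed out as an array of pre-scanned tokens.

```c
#include <stddef.h>

/* Hypothetical token codes for this sketch; not the real grammar's values. */
enum token {
    T_EOF = 0,
    T_IDENT,
    T_AS,
    T_WITH,
    T_WITH_LA,      /* stands for WITH when TIME or ORDINALITY follows */
    T_TIME,
    T_ORDINALITY
};

/* A pre-scanned token stream plus one slot of saved lookahead. */
struct filter {
    const enum token *input;    /* raw tokens, terminated by T_EOF */
    size_t pos;                 /* index of the next raw token */
    int have_lookahead;         /* is 'lookahead' valid? */
    enum token lookahead;       /* token read ahead but not yet returned */
};

/* Stand-in for the real lexer: hand out the next raw token. */
static enum token raw_lex(struct filter *f)
{
    enum token t = f->input[f->pos];

    if (t != T_EOF)
        f->pos++;
    return t;
}

/* Return the next token as the parser should see it. */
enum token filtered_lex(struct filter *f)
{
    enum token cur;

    /* Deliver a previously read-ahead token before asking the lexer. */
    if (f->have_lookahead) {
        cur = f->lookahead;
        f->have_lookahead = 0;
    } else {
        cur = raw_lex(f);
    }

    /* Only WITH needs a second token of lookahead in this sketch. */
    if (cur != T_WITH)
        return cur;

    /* Read one token ahead and keep it; it is never merged into 'cur'. */
    f->lookahead = raw_lex(f);
    f->have_lookahead = 1;

    /* Replace just the first token when a lookahead pair is seen. */
    if (f->lookahead == T_TIME || f->lookahead == T_ORDINALITY)
        cur = T_WITH_LA;

    return cur;
}
```

Feeding the raw sequence WITH, ORDINALITY, AS through filtered_lex yields T_WITH_LA, T_ORDINALITY, T_AS. The second token still arrives as its own token, so a grammar with an extra production that accepts the lookahead symbol can go on to treat "ordinality" as an ordinary name.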

I also fixed the filter logic so that when backing up after a lookahead,
the current token's terminator is correctly restored; this eliminates some
weird behavior in error message issuance, as is shown by the one change in
existing regression test outputs.
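
To illustrate the terminator issue, here is a rough sketch that assumes a flex-style lexer which NUL-terminates each token in place inside a writable scan buffer. Everything in it (struct scanner, scan_token, peek_after, take_lookahead, and the space-separated token rule) is hypothetical and only models the idea. Scanning one token ahead un-truncates the current token, so the filter has to put the terminator back before returning the current token, and undo that again when it hands out the saved one.

```c
#include <stddef.h>
#include <string.h>

/* Toy scanner state: tokens are space-separated words in 'buf'. */
struct scanner {
    char  *buf;         /* writable, NUL-terminated input text */
    size_t pos;         /* offset where the next scan starts */
    char   hold_char;   /* character replaced by the token's terminator */
    char  *hold_at;     /* where the terminator was written, or NULL */
};

/* Scan one token and NUL-terminate it in place; NULL at end of input. */
char *scan_token(struct scanner *s)
{
    char *start, *end;

    if (s->hold_at) {               /* undo the previous token's terminator */
        *s->hold_at = s->hold_char;
        s->hold_at = NULL;
    }
    start = s->buf + s->pos;
    while (*start == ' ')
        start++;
    if (*start == '\0') {
        s->pos = (size_t) (start - s->buf);
        return NULL;
    }
    end = start;
    while (*end != '\0' && *end != ' ')
        end++;
    s->hold_at = end;               /* remember what the terminator overwrote */
    s->hold_char = *end;
    *end = '\0';
    s->pos = (size_t) (end - s->buf);
    return start;
}

/* What the filter remembers about one token of lookahead. */
struct lookahead {
    char *token;        /* the token that was read ahead */
    char *cur_end;      /* end of the current (first) token */
    char  cur_hold;     /* character that belongs at cur_end */
};

/* Read one token past 'cur', then re-truncate 'cur' for error reporting. */
void peek_after(struct scanner *s, char *cur, struct lookahead *la)
{
    la->cur_end = cur + strlen(cur);    /* measure before scanning further */
    la->token = scan_token(s);          /* this un-truncates 'cur' */
    la->cur_hold = *la->cur_end;
    *la->cur_end = '\0';                /* restore the current terminator */
}

/* Hand out the saved token, restoring the text between the two tokens. */
char *take_lookahead(struct lookahead *la)
{
    *la->cur_end = la->cur_hold;
    return la->token;
}
```

With the buffer "nulls first from", the first token still reads as "nulls" after peek_after has scanned "first", so a message that quotes the text at the current token's position stops at that token's end instead of running on into the lookahead token.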

I believe that this patch entirely eliminates odd behaviors caused by
lookahead for WITH. It doesn't really improve the situation for NULLS
followed by FIRST/LAST unfortunately: those sequences still act like a
reserved word, even though there are cases where they should be seen as two
ordinary identifiers, e.g. "SELECT nulls first FROM ...". I experimented
with additional grammar hacks but couldn't find any simple solution for
that. Still, this is better than before, and it seems much more likely
that we *could* somehow solve the NULLS case on the basis of this filter
behavior than the previous one.

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/d809fd0008a2e26de463f47b7aba0365264078f3

Modified Files
--------------
src/backend/parser/gram.y | 35 ++++++---
src/backend/parser/parser.c | 105 ++++++++++++++-----------
src/include/parser/gramparse.h | 2 +
src/interfaces/ecpg/preproc/parse.pl | 6 +-
src/interfaces/ecpg/preproc/parser.c | 116 ++++++++++++++++------------
src/test/regress/expected/foreign_data.out | 2 +-
src/test/regress/expected/with.out | 15 ++++
src/test/regress/sql/with.sql | 5 ++
8 files changed, 173 insertions(+), 113 deletions(-)