Re: hint infrastructure setup (v3)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: hint infrastructure setup (v3)
Date: 2004-04-03 18:33:25
Message-ID: 17150.1081017205@sss.pgh.pa.us
Lists: pgsql-patches

Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> writes:
> That is done with YYERROR_VERBOSE, but the result is really poor
> most of the time, because it does not look for all possible terminals,
> just the ones easily accessible.

I wasn't aware that bison had a built-in facility for better messages
--- this is kind of cool actually. I played with it a little and found
that:

1. There's an arbitrary restriction in the bison code to show no more
than four alternatives; if there are more than four, it shows none
of them, rather than doing something helpful like "etc."

2. The problem you exhibit with DROP seems to be due to our use of
empty productions:

> DROP TABL foo;
> ERROR: syntax error at unexpected Ident "TABL", LANGUAGE expected.

It looks like the parser makes one additional state move, reducing
"opt_procedural" to "empty", before raising the error. We might be
able to suppress that; alternatively we could look at the stacked
state(s) and add their follow sets to the printout. In any case it
is clear that we can extract the follow set from the tables.

A bigger problem with this, though, is the verbosity of the results
in some cases. I diked out the limitation to four outputs and soon
found examples like

regression=# grant;
ERROR: syntax error, unexpected ';', expecting ALL or CREATE or DELETE_P or EXECUTE or INSERT or REFERENCES or RULE or SELECT or TEMP or TEMPORARY or TRIGGER or UPDATE or USAGE at or near ";" at character 6
LINE 1: grant;
             ^

which is useful, and

regression=# grant select on t to;
ERROR: syntax error, unexpected ';', expecting ABORT_P or ABSOLUTE_P or ACCESS or ACTION or ADD or AFTER or AGGREGATE or ALTER or ASSERTION or ASSIGNMENT or AT or BACKWARD or BEFORE or BEGIN_P or BIGINT or BIT or BOOLEAN_P or BY or CACHE or CALLED or CASCADE or CHAIN or CHAR_P or CHARACTER or CHARACTERISTICS or CHECKPOINT or CLASS or CLOSE or CLUSTER or COALESCE or COMMENT or COMMIT or COMMITTED or CONSTRAINTS or CONVERSION_P or CONVERT or COPY or CREATEDB or CREATEUSER or CURSOR or CYCLE or DATABASE or DAY_P or DEALLOCATE or DEC or DECIMAL_P or DECLARE or DEFAULTS or DEFERRED or DEFINER or DELETE_P or DELIMITER or DELIMITERS or DOMAIN_P or DOUBLE_P or DROP or EACH or ENCODING or ENCRYPTED or ESCAPE or EXCLUDING or EXCLUSIVE or EXECUTE or EXISTS or EXPLAIN or EXTERNAL or EXTRACT or FETCH or FIRST_P or FLOAT_P or FORCE or FORWARD or FUNCTION or GLOBAL or GROUP_P or HANDLER or HOLD or HOUR_P or IMMEDIATE or IMMUTABLE or IMPLICIT_P or INCLUDING or INCREMENT or INDEX or INHERITS or INOUT or INPUT_P or INSENSITIVE or INSERT or INSTEAD or INT_P or INTEGER or INTERVAL or INVOKER or ISOLATION or KEY or LANCOMPILER or LANGUAGE or LARGE_P or LAST_P or LEVEL or LISTEN or LOAD or LOCAL or LOCATION or LOCK_P or MATCH or MAXVALUE or MINUTE_P or MINVALUE or MODE or MONTH_P or MOVE or NAMES or NATIONAL or NCHAR or NEXT or NO or NOCREATEDB or NOCREATEUSER or NONE or NOTHING or NOTIFY or NULLIF or NUMERIC or OBJECT_P or OF or OIDS or OPERATOR or OPTION or OUT_P or OVERLAY or OWNER or PARTIAL or PASSWORD or PATH_P or PENDANT or POSITION or PRECISION or PRESERVE or PREPARE or PRIOR or PRIVILEGES or PROCEDURAL or PROCEDURE or READ or REAL or RECHECK or REINDEX or RELATIVE_P or RENAME or REPEATABLE or REPLACE or RESET or RESTART or RESTRICT or RETURNS or REVOKE or ROLLBACK or ROW or ROWS or RULE or SCHEMA or SCROLL or SECOND_P or SECURITY or SEQUENCE or SERIALIZABLE or SESSION or SET or SETOF or SHARE or SHOW or SIMPLE or SMALLINT or STABLE or START or STATEMENT or STATISTICS or STDIN or STDOUT or STORAGE or STRICT_P or SUBSTRING or SYSID or TEMP or TEMPLATE or TEMPORARY or TIME or TIMESTAMP or TOAST or TRANSACTION or TREAT or TRIGGER or TRIM or TRUNCATE or TRUSTED or TYPE_P or UNCOMMITTED or UNENCRYPTED or UNKNOWN or UNLISTEN or UNTIL or UPDATE or USAGE or VACUUM or VALID or VALIDATOR or VALUES or VARCHAR or VARYING or VERSION or VIEW or VOLATILE or WITH or WITHOUT or WORK or WRITE or YEAR_P or ZONE or IDENT at or near ";" at character 22
LINE 1: grant select on t to;
                            ^

which is not real useful at all :-(. You really want to see just
"expecting IDENT" in such a case. Still we might be able to do some
postprocessing on the collected set of valid follow symbols, such as
removing all the unreserved_keywords when they are present along with
IDENT. It'd be fairly reasonable to embed knowledge about this in
keywords.c and/or scan.l.
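A minimal sketch of that postprocessing step, under stated assumptions: the keyword table below is a tiny illustrative subset, and is_unreserved_keyword() and prune_expected() are hypothetical names, not existing functions in keywords.c or scan.l.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Illustrative subset only; a real table would come from keywords.c. */
static const char *const unreserved_keywords[] = {
    "ABORT_P", "ACCESS", "WORK", "ZONE"
};

/* Hypothetical lookup: is this token name an unreserved keyword? */
bool
is_unreserved_keyword(const char *tok)
{
    size_t      i;
    size_t      n = sizeof(unreserved_keywords) / sizeof(unreserved_keywords[0]);

    for (i = 0; i < n; i++)
        if (strcmp(tok, unreserved_keywords[i]) == 0)
            return true;
    return false;
}

/*
 * If IDENT is among the expected tokens, drop every unreserved keyword,
 * since each of those is also acceptable as an identifier.  Compacts
 * expected[] in place and returns the new count.
 */
size_t
prune_expected(const char **expected, size_t n)
{
    bool        have_ident = false;
    size_t      i;
    size_t      out = 0;

    for (i = 0; i < n; i++)
        if (strcmp(expected[i], "IDENT") == 0)
            have_ident = true;
    if (!have_ident)
        return n;               /* nothing to collapse */
    for (i = 0; i < n; i++)
        if (!is_unreserved_keyword(expected[i]))
            expected[out++] = expected[i];
    return out;
}
```

Applied to the "grant select on t to;" case, a filter along these lines would collapse the unreserved keywords into IDENT, while the "grant;" list, which has no IDENT, would pass through unchanged.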

We'd have to write our own version of bison's verbose-error code anyway,
because the canned code doesn't support localization --- it uses
hardwired strings for "expecting" and so on. But it looks possibly
doable to me.
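For illustration, a home-grown message builder might look roughly like the sketch below. Everything here is an assumption: _() is an identity stand-in for a gettext-style translation macro, format_expected() is a hypothetical name, and gluing translated fragments together is itself a simplification that a real implementation would probably want to avoid. It also shows the "etc." behavior in place of dropping the whole list.

```c
#include <stdio.h>
#include <string.h>

/* Stand-in for a gettext-style translation macro; identity here. */
#define _(msgid) (msgid)

/* Append s to the NUL-terminated string in buf, truncating if needed. */
static void
append(char *buf, size_t buflen, const char *s)
{
    size_t      used = strlen(buf);

    if (used + 1 < buflen)
        snprintf(buf + used, buflen - used, "%s", s);
}

/*
 * Build "syntax error, unexpected X, expecting A or B, etc." showing at
 * most max_shown alternatives instead of suppressing all of them.
 */
void
format_expected(char *buf, size_t buflen, const char *unexpected,
                const char *const *expected, size_t n, size_t max_shown)
{
    size_t      i;

    if (buflen == 0)
        return;
    snprintf(buf, buflen, _("syntax error, unexpected %s"), unexpected);
    for (i = 0; i < n && i < max_shown; i++)
    {
        append(buf, buflen, i == 0 ? _(", expecting ") : _(" or "));
        append(buf, buflen, expected[i]);
    }
    if (n > max_shown)
        append(buf, buflen, _(", etc."));
}
```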

regards, tom lane
