Skip site navigation (1) Skip section navigation (2)

bison location reporting for potentially-empty list productions

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: bison location reporting for potentially-empty list productions
Date: 2012-10-04 00:29:24
Message-ID: 10999.1349310564@sss.pgh.pa.us (view raw or flat)
Thread:
Lists: pgsql-hackers
In the just-committed patch for CREATE SCHEMA IF NOT EXISTS, there is
an error thrown by the grammar when IF NOT EXISTS is specified together
with any schema-element clauses.  I thought it would make more sense for
the error cursor to point at the schema-element clauses, rather than at
the IF NOT EXISTS as submitted.  So the code looks like, eg,

            | CREATE SCHEMA IF_P NOT EXISTS ColId OptSchemaEltList
                    ...
                    if ($7 != NIL)
                        ereport(ERROR,
                                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                                 errmsg("CREATE SCHEMA IF NOT EXISTS cannot include schema elements"),
                                 parser_errposition(@7)));

However, it turns out this actually results in a cursor pointing at the
previous word:

CREATE SCHEMA IF NOT EXISTS test_schema_1 -- fail, disallowed
       CREATE TABLE abc (
              a serial,
              b int UNIQUE
       );
ERROR:  CREATE SCHEMA IF NOT EXISTS cannot include schema elements
LINE 1: CREATE SCHEMA IF NOT EXISTS test_schema_1 
                                    ^

After some study I think what is happening is that there's a deficiency
in the location-tracking macro in gram.y:

/* Location tracking support --- simpler than bison's default */
#define YYLLOC_DEFAULT(Current, Rhs, N) \
    do { \
        if (N) \
            (Current) = (Rhs)[1]; \
        else \
            (Current) = (Rhs)[0]; \
    } while (0)

OptSchemaEltList looks like this:

OptSchemaEltList:
            OptSchemaEltList schema_stmt            { $$ = lappend($1, $2); }
            | /* EMPTY */                           { $$ = NIL; }
        ;

When the macro is evaluated for the initial empty production for
OptSchemaEltList, the "else" case causes it to index off the beginning
of the array and pick up the location for the preceding token.  Then,
each succeeding reduction of the first part of the production copies
that bogus value from the first RHS member item.  So this same problem
applies not only to OptSchemaEltList but to any case where we've formed
a zero-or-more-element-list production in this style.  So far as I can
tell, there are no other cases in the current grammar where we make use
of the position of a list nonterminal for error-reporting purposes,
which is why we'd not noticed this before.  But it seems likely to come
up again.

It seems clear to me now that the macro ought at least to be changed to

#define YYLLOC_DEFAULT(Current, Rhs, N) \
    do { \
        if (N) \
            (Current) = (Rhs)[1]; \
        else \
            (Current) = -1; \
    } while (0)

so that the parse location attributed to an empty production is -1
(ie, unknown) rather than the current quite bogus behavior of marking
it as the preceding token's location.  (For one thing, what if there
*isn't* a preceding token?  That's not possible I think in the current
grammar, but if it were possible this code would be indexing off the
beginning of the locations array.)

But that change wouldn't really fix the issue for CREATE SCHEMA ---
the -1 would be propagated up to all instances of OptSchemaEltList
even after we'd reduced the first production a few times, so that
we'd end up printing no error cursor for this case.

To produce a really useful cursor for this type of situation I think
we would have to do something like this:

#define YYLLOC_DEFAULT(Current, Rhs, N) \
    do { \
        int i;
        (Current) = -1; \
        for (i = 1; i <= (N); i++)
        {
            (Current) = (Rhs)[i]; \
            if ((Current) >= 0)
                break;
        }
    } while (0)

This is pretty ugly and seems likely to create a visible hit on the
parser's speed (which is already not that good).  I'm not sure it's
worth it for something that seems to be a corner case --- anyway
we've not hit it before in six years since the location tracking
support was added.

At this point I'm inclined to change the macro to the second case
(with the -1) and accept a less good error cursor position for the
CREATE SCHEMA error.

Has anybody got a better idea?

			regards, tom lane


Responses

pgsql-hackers by date

Next:From: Tom LaneDate: 2012-10-04 01:00:27
Subject: Re: Support for REINDEX CONCURRENTLY
Previous:From: David E. WheelerDate: 2012-10-04 00:17:10
Subject: Re: CREATE SCHEMA IF NOT EXISTS

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group