Re: Some surprising precedence behavior in PG's grammar

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Some surprising precedence behavior in PG's grammar
Date: 2011-05-05 00:41:11
Message-ID: 4DC1F227.707@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 05/04/2011 07:39 PM, Tom Lane wrote:
> While looking at the grammar's operator-precedence declarations in
> connection with a recent pgsql-docs question, it struck me that this
> declaration is a foot-gun waiting to go off:
>
> %nonassoc IS NULL_P TRUE_P FALSE_P UNKNOWN /* sets precedence for IS NULL, etc */
>
> The only terminal that we actually need to attach precedence to for
> IS NULL and related productions is "IS". The others are just listed
> there to save attaching explicit %prec declarations to those productions.
> This seems like a bad idea, because attaching a precedence to a terminal
> symbol that doesn't absolutely have to have one is just asking for
> trouble: it can cause bison to accept grammars that would better have
> provoked a shift/reduce error, and to silently resolve the ambiguity in
> a way that you maybe didn't expect.
>
> So I thought to myself that it'd be better to remove the unnecessary
> precedence markings, and tried, with the attached patch. And behold,
> I got a shift/reduce conflict:
>
> state 2788
>
> 1569 b_expr: b_expr qual_Op . b_expr
> 1571 | b_expr qual_Op .
>
> NULL_P shift, and go to state 1010
> NULL_P [reduce using rule 1571 (b_expr)]
>
> So the god of unintended consequences has been here already. What this
> is telling us is that in examples such as
>
> CREATE TABLE foo (f1 int DEFAULT 10 %% NULL);
>
> it is not immediately clear to bison whether to shift upon seeing the
> NULL (which leads to a parse tree that says %% is an infix operator with
> arguments 10 and NULL), or to reduce (which leads to a parse tree that
> says that %% is a postfix operator with argument 10, and NULL is a
> column declaration constraint separate from the DEFAULT expression).
>
> If you try the experiment, you find out that the first interpretation is
> preferred by the current grammar:
>
> ERROR: operator does not exist: integer %% unknown

Yeah, IIRC the default action for a shift/reduce conflict is to shift,
as it's usually the right thing to do.

> Now, this is probably a good thing, because NULL as a column declaration
> constraint is not SQL standard (only NOT NULL is), so we're resolving
> the ambiguity in a way that's more likely to be SQL-compatible. But it
> could be surprising to somebody who expected the other behavior,
> especially since this seemingly-closely-related command is parsed the
> other way:
>
> CREATE TABLE foo (f1 int DEFAULT 10 %% NOT NULL);
> ERROR: operator does not exist: integer %%
>
> And the reason for that difference in behavior is that NOT has a
> declared precedence lower than POSTFIXOP, whereas NULL has a declared
> precedence that's higher. That comparison determines how bison resolves
> the shift/reduce conflict.
>
> The fact that this behavior falls out of a precedence declaration that's
> seemingly quite unrelated, and was surely not designed with this case in
> mind, is a perfect example of why I say that precedence declarations are
> hazardous.
>
> So I'd still like to get rid of the precedence markings for TRUE_P,
> FALSE_P, and UNKNOWN, and will do so unless somebody has a really good
> reason not to. (It looks like we could avoid marking ZONE, too.) But
> I would be happier if we could also not mark NULL, because that's surely
> used in a lot of other places, and could easily bite us a lot harder
> than this. Can anyone think of an alternative way to resolve this
> particular conflict without the blunt instrument of a precedence marking?
>
>

My bison-fu is a bit rusty, but years ago I could do this stuff in my
sleep. I'll be surprised if there isn't a way.

If we do need a precedence setting for NULL_P, then I think it should
probably be on its own and not sharing one with IS.

If you don't solve it soon I'll take a look after I clear a couple of
higher priority items from my list.

cheers

andrew

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2011-05-05 00:44:26 Re: VARIANT / ANYTYPE datatype
Previous Message Tom Lane 2011-05-04 23:50:53 Re: VARIANT / ANYTYPE datatype