Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: pgsql-hackers(at)postgresql(dot)org, david(at)fetter(dot)org, Oliver Ford <ojford(at)gmail(dot)com>, Krasiyan Andreev <krasiyan(at)gmail(dot)com>
Subject: Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options
Date: 2018-09-23 23:10:25
Message-ID: 9361.1537744225@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> writes:
> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
> Tom> The FROM FIRST/LAST bit seems particularly badly thought through,
> Tom> because AFAICS it is flat out ambiguous with a normal FROM clause
> Tom> immediately following the window function call. The only way to
> Tom> make it not so would be to make FIRST and LAST be fully reserved,
> Tom> which is neither a good idea nor spec-compliant.

> In the actual spec syntax it's not ambiguous at all because NTH_VALUE is
> a reserved word (as are LEAD, LAG, FIRST_VALUE and LAST_VALUE), and OVER
> is a mandatory clause in its syntax, so a FROM appearing before the OVER
> must be part of a FROM FIRST/LAST and not introducing a FROM-clause.

Hmm ...

> In our syntax, if we made NTH_VALUE etc. a col_name_keyword (and thus
> not legal as a function name outside its own special syntax) it would
> also become unambiguous.
> i.e. given this token sequence (with . marking the current posision):
> select nth_value(x) . from first ignore
> if we know up front that "nth_value" is a window function and not any
> other kind of function, we know that we have to shift the "from" rather
> than reducing the select-list because we haven't seen an "over" yet.

I don't really find that to be a desirable solution, because quite aside
from the extensibility problem, it would mean that a lot of errors
become "syntax error" where we formerly gave a more useful message.

This does open up a thought about how to proceed, though. I'd been
trying to think of a way to solve this using base_yylex's ability to
do some internal lookahead and change token types based on that.
If you just think of recognizing FROM FIRST/LAST, you get nowhere
because that's still legal in other contexts. But if you were to
look for FROM followed by FIRST/LAST followed by IGNORE/RESPECT/OVER,
I think that could only validly happen in this syntax. It'd take
some work to extend base_yylex to look ahead 2 tokens not one, but
I'm sure that could be done. (You'd also need a lookahead rule to
match "IGNORE/RESPECT NULLS OVER", but that seems just as doable.)
Then the relevant productions use FROM_LA, IGNORE_LA, RESPECT_LA
instead of the corresponding bare tokens, and the grammar no longer
has an ambiguity problem.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2018-09-23 23:28:45 Re: when set track_commit_timestamp on, database system abort startup
Previous Message Michael Paquier 2018-09-23 22:40:47 Re: Changing the setting of wal_sender_timeout per standby