Re: SQL/JSON functions vs. ECPG vs. STRING as a reserved word

From: Noah Misch <noah(at)leadboat(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Michael Meskes <meskes(at)postgresql(dot)org>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: SQL/JSON functions vs. ECPG vs. STRING as a reserved word
Date: 2022-07-03 08:01:27
Message-ID: 20220703080127.GA2476530@rfd.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, May 30, 2022 at 05:20:15PM -0400, Tom Lane wrote:

[allow EXEC SQL TYPE unreserved_keyword IS ...]

> 1. In pgc.l, if an identifier is a typedef name, ignore any possible
> keyword meaning and return it as an IDENT. (I'd originally supposed
> that we'd want to return some new TYPEDEF token type, but that does
> not seem to be necessary right now, and adding a new token type would
> increase the patch footprint quite a bit.)
>
> 2. In the var_type production, forget about ECPGColLabel[Common]
> and just handle the keywords we know we need, plus IDENT for the
> typedef case. It turns out that we have to have duplicate coding
> because most of these keywords are not keywords in C lexing mode,
> so that they'll come through the IDENT path anyway when we're
> in a C rather than SQL context. That seemed acceptable to me.
> I thought about adding them all to the C keywords list but that
> seemed likely to have undesirable side-effects, and again it'd
> bloat the patch footprint.
>
> This fix is not without downsides. Disabling recognition of
> keywords that match typedefs means that, for example, if you
> declare a typedef named "work" then ECPG will fail to parse
> "EXEC SQL BEGIN WORK". So in a real sense this is just trading
> one hazard for another. But there is an important difference:
> with this, whether your ECPG program works depends only on what
> typedef names and SQL commands are used in the program. If
> it compiles today it'll still compile next year, whereas with
> the present implementation the addition of some new unreserved
> SQL keyword could break it. We'd have to document this change
> for sure, and it wouldn't be something to back-patch, but it
> seems like it might be acceptable from the users' standpoint.

I agree this change is more likely to please a user than to harm a user. The
user benefit is slim, but the patch is also slim.

> We could narrow (not eliminate) this hazard if we could get the
> typedef lookup in pgc.l to happen only when we're about to parse
> a var_type construct. But because of Bison's lookahead behavior,
> that seems to be impossible, or at least undesirably messy
> and fragile. But perhaps somebody else will see a way.

I don't, though I'm not much of a Bison wizard.

> Anyway, this seems like too big a change to consider for v15,
> so I'll stick this patch into the v16 CF queue. It's only
> draft quality anyway --- lacks documentation changes and test
> cases. There are also some coding points that could use review.
> Notably, I made the typedef lookup override SQL keywords but
> not C keywords; this is for consistency with the C-mode lookup
> rules, but is it the right thing?

That decision seems fine. ScanCKeywordLookup() covers just twenty-six
keywords, and that list hasn't changed since 2003. Moreover, most of them are
keywords of the C language itself, so allowing them would entailing mangling
them in the generated C to avoid C compiler errors. Given the lack of
complaints, let's not go there.

I didn't locate any problems beyond the test and doc gaps that you mentioned,
so I've marked this Ready for Committer.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Noah Misch 2022-07-03 08:32:17 Re: 15beta1 tab completion of extension versions
Previous Message Tom Lane 2022-07-03 03:59:58 Re: Probable memory leak with ECPG and AIX