Re: SQL/JSON functions vs. ECPG vs. STRING as a reserved word

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Michael Meskes <meskes(at)postgresql(dot)org>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: SQL/JSON functions vs. ECPG vs. STRING as a reserved word
Date: 2022-05-30 21:20:15
Message-ID: 4158666.1653945615@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Michael Meskes <meskes(at)postgresql(dot)org> writes:
>> This seems to be because what follows ecpgstart can be either a general
>> SQL statement or an ECPGVarDeclaration (beginning with var_type),
>> and bison isn't smart enough to disambiguate. I have a feeling that
>> this situation could be improved with enough elbow grease, because
>> plpgsql manages to solve a closely-related problem in allowing its
>> assignment statements to coexist with general SQL statements.

> Right, the reason for all this is that the part after the "exec sql" could be
> either language, SQL or C. It has been like this for all those years. I do not
> claim that the solution we have is the best, it's only the best I could come up
> with when I implemented it. If anyone has a better way, please be my guest.

I pushed the proposed patch, but after continuing to think about
it I have an idea for a possible solution to the older problem.
I noticed that the problematic cases in var_type aren't really
interested in seeing any possible unreserved keyword: they care about
certain specific built-in type names (most of which are keywords
already) and about typedef names. Now, almost every C-parsing program
I've ever seen has to lex typedef names specially, so what if we made
ecpg do that too? After a couple of false starts, I came up with the
attached draft patch. The key ideas are:

1. In pgc.l, if an identifier is a typedef name, ignore any possible
keyword meaning and return it as an IDENT. (I'd originally supposed
that we'd want to return some new TYPEDEF token type, but that does
not seem to be necessary right now, and adding a new token type would
increase the patch footprint quite a bit.)

2. In the var_type production, forget about ECPGColLabel[Common]
and just handle the keywords we know we need, plus IDENT for the
typedef case. It turns out that we have to have duplicate coding
because most of these keywords are not keywords in C lexing mode,
so that they'll come through the IDENT path anyway when we're
in a C rather than SQL context. That seemed acceptable to me.
I thought about adding them all to the C keywords list but that
seemed likely to have undesirable side-effects, and again it'd
bloat the patch footprint.

This fix is not without downsides. Disabling recognition of
keywords that match typedefs means that, for example, if you
declare a typedef named "work" then ECPG will fail to parse
"EXEC SQL BEGIN WORK". So in a real sense this is just trading
one hazard for another. But there is an important difference:
with this, whether your ECPG program works depends only on what
typedef names and SQL commands are used in the program. If
it compiles today it'll still compile next year, whereas with
the present implementation the addition of some new unreserved
SQL keyword could break it. We'd have to document this change
for sure, and it wouldn't be something to back-patch, but it
seems like it might be acceptable from the users' standpoint.

We could narrow (not eliminate) this hazard if we could get the
typedef lookup in pgc.l to happen only when we're about to parse
a var_type construct. But because of Bison's lookahead behavior,
that seems to be impossible, or at least undesirably messy
and fragile. But perhaps somebody else will see a way.

Anyway, this seems like too big a change to consider for v15,
so I'll stick this patch into the v16 CF queue. It's only
draft quality anyway --- lacks documentation changes and test
cases. There are also some coding points that could use review.
Notably, I made the typedef lookup override SQL keywords but
not C keywords; this is for consistency with the C-mode lookup
rules, but is it the right thing?

regards, tom lane

Attachment Content-Type Size
handle-unreserved-typedef-names-1.patch text/x-diff 10.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2022-05-30 21:37:16 Re: PG15 beta1 sort performance regression due to Generation context change
Previous Message Andres Freund 2022-05-30 19:57:58 Re: Ignoring BRIN for HOT udpates seems broken