Re: hint infrastructure setup (v3)

From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: hint infrastructure setup (v3)
Date: 2004-04-06 08:18:11
Message-ID: Pine.LNX.4.58.0404060901020.8826@sablons.cri.ensmp.fr
Lists: pgsql-patches


Dear Tom,

> > I join a small proof-of-concept patch to drop some tokens out of the
> > parser.
>
> I believe these were treated this way *specifically* because of the
> keyword-is-not-an-identifier issue. SQL99 calls out most of these
> as being keywords:

Well, I think that the "reserved keywords" are fine as tokens in a
lexer/parser, but I think that the "unreserved keywords" should lose
their token status if possible.

> and if we don't treat them as keywords then we will have a couple of
> problems. One is case-conversion issues in locales where the standard
> downcasing is not an extension of ASCII (Turkish is the only one I know
> of offhand).

Do you mean it should use an ASCII-only strcasecmp, not a possibly
"localised" version? I agree, but this is just a "proof of concept"
patch to show that you don't need so many tokens in the parser.
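
For illustration, an ASCII-only comparison could look like the sketch below. The names ascii_tolower and ascii_strcasecmp are hypothetical and are not taken from the patch. Only 'A'..'Z' are folded, so the result does not depend on the current locale, which is the property that matters for the Turkish case mentioned above.

```c
/* Fold only 'A'..'Z' to lower case; every other byte is left untouched,
 * so the result is the same whatever the current locale is.
 * (Hypothetical helper, not part of the patch.) */
unsigned char
ascii_tolower(unsigned char ch)
{
    if (ch >= 'A' && ch <= 'Z')
        ch += 'a' - 'A';
    return ch;
}

/* strcasecmp-like comparison that ignores case for ASCII letters only.
 * Returns 0 if equal, a negative value if s1 sorts first, else positive. */
int
ascii_strcasecmp(const char *s1, const char *s2)
{
    for (;;)
    {
        unsigned char c1 = ascii_tolower((unsigned char) *s1++);
        unsigned char c2 = ascii_tolower((unsigned char) *s2++);

        if (c1 != c2)
            return (c1 < c2) ? -1 : 1;
        if (c1 == '\0')
            return 0;
    }
}
```

With such a function, "BOOLEAN", "Boolean" and "boolean" all compare equal, while bytes outside the ASCII letters are compared as they are.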

> Another is that depending on where you put the renaming that this patch
> removes without replacing :-(,

I do not understand your point. It seems to me that the renaming is
performed when a type name is expected: the "boolean" keyword (not token)
is translated to the system "bool" type in the GenericType rule, isn't it?

> it would be possible for the renaming transformation to get applied to
> user-defined types with similar names, or for user-defined types to
> unexpectedly shadow system definitions.

I don't think that the patch changes the result of the parsing. It drops
*TOKENS* out of the lexer, but they are still *KEYWORDS*, although they
are not explicitly in the lexer list.
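
As a rough sketch of the idea, the translation can be a plain table lookup done where a type name is expected, on an ordinary identifier. Everything below is hypothetical: the names TypeAlias and map_type_name are made up, only the "boolean" to "bool" pair comes from this message, the other entries are mere examples, and the identifier is assumed to be already downcased by the lexer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical alias table: SQL spelling -> system type name.
 * Only "boolean" -> "bool" is taken from the message; the other
 * entries are illustrative examples. */
typedef struct TypeAlias
{
    const char *sql_name;
    const char *system_name;
} TypeAlias;

static const TypeAlias type_aliases[] = {
    {"boolean", "bool"},
    {"real", "float4"},
    {"smallint", "int2"},
};

/* Meant to be called only from the rule that expects a type name.
 * Assumes ident was already downcased by the lexer.
 * Returns the system name for a known alias, else ident unchanged. */
const char *
map_type_name(const char *ident)
{
    size_t i;

    for (i = 0; i < sizeof(type_aliases) / sizeof(type_aliases[0]); i++)
    {
        if (strcmp(ident, type_aliases[i].sql_name) == 0)
            return type_aliases[i].system_name;
    }
    return ident;
}
```

The word never has to be a distinct token for this to work. The sketch does not address quoted identifiers or schema-qualified names, which would need their own handling.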

"keyword.c" deals with tokens, the file name was ill-chosen. If you think
that keywords can only be lexical tokens, then you end up with an
automaton larger than necessary, IMVHO.

Note that the removed tokens are still "keywords", as they are treated
*specially* anyway. The patch is not a semantic transformation.

Also, if you don't want these names as candidate function names, they
could be filtered out at some other point in the parser. They really don't
need to be special tokens.
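
Such a filter could be as simple as the following sketch, applied to the identifier in the rule that accepts a function name. The list contents and the name is_allowed_function_name are hypothetical, and the identifier is again assumed to be already downcased by the lexer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical list of names that must not be accepted as function
 * names; the entries are examples, not the set handled by the patch. */
static const char *const not_function_names[] = {
    "boolean",
    "real",
    "smallint",
};

/* Returns 1 if ident may be used as a function name, 0 otherwise.
 * Assumes ident was already downcased by the lexer. */
int
is_allowed_function_name(const char *ident)
{
    size_t i;
    size_t n = sizeof(not_function_names) / sizeof(not_function_names[0]);

    for (i = 0; i < n; i++)
    {
        if (strcmp(ident, not_function_names[i]) == 0)
            return 0;
    }
    return 1;
}
```

The check costs one lookup in one grammar action, instead of one extra token per name in the automaton.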

My point is that you can have the very same *semantic* result with a
smaller automaton if you choose a different trade-off between the
lexer, the parser and post-filtering. I don't want to change the language.

> The former would be surprising and the latter would violate the spec.

I'm really not sure this is the case with the patch I sent.

> Check the archives; I'm sure this was discussed in the 7.3 development
> cycle and we concluded that treating these names as keywords was the
> only practical solution.

Hmmm... I can check the archive, but I cannot see how the language
differs with the patch. Maybe a filter is missing, or strcasecmp is not
the right variant, but nothing more than that.

I think it is a small technical issue in the parser internals, and has
nothing to do with great principles and whether this or that is a keyword.
It's about what keywords need to be tokens.

--
Fabien Coelho - coelho(at)cri(dot)ensmp(dot)fr
