Re: benchmarking Flex practices

From: John Naylor <john(dot)naylor(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: benchmarking Flex practices
Date: 2019-12-03 11:02:17
Message-ID: CACPNZCv9fWE9Q8yVwHgQ4=5y3RBi2CkzoG2MD8Q_3ftNVt0sCg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Nov 26, 2019 at 10:32 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> I haven't looked closely at what ecpg does with the processed
> identifiers. If it just spits them out as-is, a possible solution
> is to not do anything about de-escaping, but pass the sequence
> U&"..." (plus UESCAPE ... if any), just like that, on to the grammar
> as the value of the IDENT token.

It does pass them along as-is, so I did it that way.

In the attached v10, I've synced both ECPG and psql.

> * I haven't convinced myself either way as to whether it'd be
> better to factor out the code duplicated between the UIDENT
> and UCONST cases in base_yylex.

I chose to factor it out, since we have 2 versions of parser.c, and
this way was much easier to work with.

Some notes:

I arranged for the ECPG grammar to only see SCONST and IDENT. With
UCONST and UIDENT out of the way, it was a small additional step to
put all string reconstruction into the lexer, which has the advantage
of allowing removal of the other special-case ECPG string tokens as
well. The fewer special cases involved in pasting the grammar
together, the better. In doing so, I've probably introduced memory
leaks, but I wanted to get your opinion on the overall approach before
investigating.

In ECPG's parser.c, I simply copied check_uescapechar() and
ecpg_isspace(), but we could find a common place if desired. During
development, I found that this file replicates the location-tracking
logic in the backend, but doesn't seem to make use of it. I also would
have had to replicate the backend's datatype for YYLTYPE. Fixing that
might be worthwhile some day, but to get this working, I just ripped
out the extra location tracking.

I no longer use state variables to track scanner state, and in fact I
removed the existing "state_before" variable in ECPG. Instead, I used
the Flex builtins yy_push_state(), yy_pop_state(), and yy_top_state().
These have been a feature for a long time, it seems, so I think we're
okay as far as portability. I think it's cleaner this way, and
possibly faster. I also used this to reunite the xcc and xcsql states.
This whole part could be split out into a separate refactoring patch
to be applied first, if desired.

--
John Naylor https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment Content-Type Size
v10-handle-uescapes-in-parser.patch application/octet-stream 61.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pengzhou Tang 2019-12-03 11:05:34 Errors when update a view with conditional-INSTEAD rules
Previous Message Amit Kapila 2019-12-03 10:56:49 Re: [HACKERS] Block level parallel vacuum