Re: Replacing plpgsql's lexer

From: Greg Stark <stark(at)enterprisedb(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Replacing plpgsql's lexer
Date: 2009-04-15 10:50:08
Message-ID: 4136ffa0904150350j222f1e6brfdca27652d3986fd@mail.gmail.com
Lists: pgsql-hackers

On Wed, Apr 15, 2009 at 11:33 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>
>> This is a fundamental conflict, not one that has a single simple answer.
>>
>> However this seems like a strange place to pick your battle.
>
> I think you are right that you perceive a fundamental conflict and most
> things I say become battles. That is not my choice and I will withdraw
> from further discussion. My point has been made clearly and has not been
> made to cause conflict. I've better things to do with my time than that,
> though it's a shame you think that of me.

Uhm, I didn't intend this as criticism at all, except insofar as I
questioned whether the plpgsql lexer was a good place to make this
stand. I only used "battle" because of the idiom "pick your battle".

I think we are in general too conservative about making changes and
you are concerned that we're not giving enough thought to the upgrade
pain and should be more conservative. We can talk about general
policies but ultimately we'll have to debate each change on its
merits.

In this case it would help if we described the specific kinds of code
affected and the consequences for users. I'm not sure we're all on the
same page.

I think changing the lexer to match the SQL lexer will only affect
string constants and only if standards_conforming_strings is enabled,
and only those instances which are handled internally to plpgsql and
not passed to the SQL engine. So the fix will pretty much always be
local to the behaviour change. It's possible for an escaped string to
need an E'' and for the backslash to migrate to other parts of the
code before triggering a bug (or possibly even get stored in the
database and cause a problem in other parts of the application). But
it should still be pretty straightforward to find the original source
of the string and also pretty easy to recognize string constants
throughout the source code.

As it currently stands a programmer sometimes has to use E'\x' and
sometimes has to use '\x', depending on whether plpgsql is lexing
the string itself or passing it to the SQL engine unlexed. It's not
obvious to a user which parts get handled which way, since some
constructs that don't appear to be SQL are handled as SQL and vice
versa -- at least it's not obvious to me, even having read the source
in the past.
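
To make the two spellings concrete, here is a small sketch at the
plain SQL level (not plpgsql), assuming a psql session where
standards_conforming_strings can be set. The literal 'a\nb' is just a
made-up example value; the lengths show whether the backslash was
treated as an escape.

```sql
-- Illustrative only: plain SQL, not plpgsql; 'a\nb' is a made-up value.
SET standards_conforming_strings = on;

-- Backslash is an ordinary character here: a, \, n, b
SELECT length('a\nb');   -- 4

-- E'' syntax always processes backslash escapes: a, newline, b
SELECT length(E'a\nb');  -- 3

SET standards_conforming_strings = off;

-- Ordinary literals now process backslash escapes as well
-- (the server may emit an escape_string_warning for this one)
SELECT length('a\nb');   -- 3
```

Inside a plpgsql function the same literal can currently end up in
either category, depending on who does the lexing, which is the
inconsistency described above.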

If I understand things correctly, I think the change improves the
language for future users by far more than it imposes maintenance
costs on existing users, especially considering that anyone depending
on '\x' strings with standards_conforming_strings enabled is probably
already getting it wrong in some places without realizing it anyway.

--
greg
