Re: [HACKERS] dollar quoting

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, "Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>
Subject: Re: [HACKERS] dollar quoting
Date: 2004-02-14 20:04:31
Message-ID: 402E7F4F.3080300@dunslane.net
Lists: pgsql-hackers pgsql-patches

Tom Lane wrote:

>Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>
>
>>I ended up not using a regex, which seemed to be a little heavy handed,
>>but just writing a small custom recognition function, that should (and I
>>think does) mimic the pattern recognition for these tokens used by the
>>backend lexer.
>>
>>
>
>I looked at this and realized that it still doesn't do very well at
>distinguishing $foo$ from other random uses of $. The problem is that
>looking back at just the immediately preceding character isn't enough
>context to tell whether a $ is part of an identifier. Consider the
>input
> a42$foo$
>This is a legal identifier according to PG 7.4. But how about
> 42$foo$
>This is a syntax error in 7.4, and we propose to redefine it as an
>integer literal '42' followed by a dollar-quote start symbol.
>

The test in the patch I sent is this:

    else if (!dol_quote && valid_dolquote(line+i) &&
             (i == 0 ||
              ! ((line[i-prevlen] & 0x80) != 0 ||
                 isalnum(line[i-prevlen]) ||
                 line[i-prevlen] == '_' ||
                 line[i-prevlen] == '$' )))

The test should not succeed anywhere in the string '42$foo$'.
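
To make that claim checkable, here is a minimal standalone sketch of the same test. It is not the patch itself: this valid_dolquote() is a stand-in written from the delimiter shape discussed in this thread ('$', an optional tag, '$'), the helper names prev_is_ident_char() and first_dolquote_start() are made up for illustration, and prevlen is fixed at 1, which assumes a single-byte encoding.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Stand-in for the patch's valid_dolquote() (assumption, not the real code):
 * true if s starts with '$', an optional tag, and a closing '$'.
 * Assumed tag rules: first char is a letter, '_' or a high-bit byte;
 * later chars may also be digits.
 */
static bool
valid_dolquote(const char *s)
{
    unsigned char c;
    size_t i;

    if (s[0] != '$')
        return false;
    if (s[1] == '$')
        return true;                    /* empty tag: "$$" */

    c = (unsigned char) s[1];
    if (!((c & 0x80) != 0 || isalpha(c) || c == '_'))
        return false;

    for (i = 2; s[i] != '\0'; i++)
    {
        c = (unsigned char) s[i];
        if (c == '$')
            return true;
        if (!((c & 0x80) != 0 || isalnum(c) || c == '_'))
            return false;
    }
    return false;                       /* ran out of input before closing '$' */
}

/* The lookback part of the test: could the previous char belong to an identifier? */
static bool
prev_is_ident_char(const char *line, size_t i, size_t prevlen)
{
    unsigned char p = (unsigned char) line[i - prevlen];

    return (p & 0x80) != 0 || isalnum(p) || p == '_' || p == '$';
}

/*
 * Hypothetical driver: index of the first position where the test succeeds,
 * or -1 if it succeeds nowhere. prevlen is assumed to be 1.
 */
static int
first_dolquote_start(const char *line)
{
    size_t i;

    for (i = 0; line[i] != '\0'; i++)
    {
        if (valid_dolquote(line + i) &&
            (i == 0 || !prev_is_ident_char(line, i, 1)))
            return (int) i;
    }
    return -1;
}
```

With this sketch, first_dolquote_start("42$foo$") and first_dolquote_start("a42$foo$") both return -1, while first_dolquote_start("select $foo$") returns 7. Note that this only shows what the lookback test does; whether rejecting '42$foo$' matches what the backend will do is exactly the point under discussion.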

Note that psql does not change any '$foo$' at all - it just passes it to
the backend. The reason we need this at all in psql is that it has to
detect the end of a statement, and it has to prompt correctly, and to do
that it needs to know if we are in a quote (single, double, dollar) or a
comment.
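
As a rough illustration of why the quote state matters for finding the end of a statement, the sketch below scans a buffer for a ';' that is outside any quote or comment. It is a simplification under stated assumptions, not psql's actual scanner: the names dolq_len() and statement_complete() are hypothetical, only '--' comments are handled (no /* */), backslash escapes are ignored, and the identifier lookback test from the patch is left out for brevity.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Length of a dollar-quote delimiter ("$tag$" or "$$") starting at s, else 0. */
static size_t
dolq_len(const char *s)
{
    unsigned char c;
    size_t i = 1;

    if (s[0] != '$')
        return 0;
    c = (unsigned char) s[1];
    if (c != '$')
    {
        if (!((c & 0x80) != 0 || isalpha(c) || c == '_'))
            return 0;
        for (i = 2;; i++)
        {
            c = (unsigned char) s[i];
            if (c == '$')
                break;
            if (!((c & 0x80) != 0 || isalnum(c) || c == '_'))
                return 0;               /* also catches the terminating NUL */
        }
    }
    return i + 1;
}

/* True if buf contains a ';' that is not inside a quote or a '--' comment. */
static bool
statement_complete(const char *buf)
{
    char        quote = 0;              /* 0, '\'' or '"' */
    const char *tag = NULL;             /* opening dollar-quote delimiter, if any */
    size_t      taglen = 0;
    size_t      i, n;

    for (i = 0; buf[i] != '\0'; i++)
    {
        if (tag != NULL)
        {
            /* inside a dollar quote: only the same delimiter closes it */
            if (strncmp(buf + i, tag, taglen) == 0)
            {
                i += taglen - 1;
                tag = NULL;
            }
        }
        else if (quote != 0)
        {
            if (buf[i] == quote)
                quote = 0;
        }
        else if (buf[i] == '\'' || buf[i] == '"')
            quote = buf[i];
        else if (buf[i] == '-' && buf[i + 1] == '-')
        {
            /* skip to end of line */
            while (buf[i + 1] != '\0' && buf[i + 1] != '\n')
                i++;
        }
        else if ((n = dolq_len(buf + i)) > 0)
        {
            tag = buf + i;
            taglen = n;
            i += n - 1;
        }
        else if (buf[i] == ';')
            return true;
    }
    return false;
}
```

For example, statement_complete("select $foo$ a; b") is false because the ';' sits inside an open dollar quote, so psql would keep prompting for more input, whereas statement_complete("select $foo$ a; b $foo$;") is true.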

psql does not detect many syntax errors, or even lexical errors - that
is the job of the backend - rightly so, I believe.

>
>There's no way to tell these apart with a single-character lookback,
>or indeed any fixed number of characters of lookback.
>

I'm still not convinced, although maybe there's something I'm not getting.

>
>I begin to think that we'll really have to bite the bullet and convert
>psql's input parser to use flex. If we're not scanning with exactly the
>same rules as the backend uses, we're going to get the wrong answers.
>
>
>

Interacting with lexer states would probably be ... unpleasant. Matching
a stream-oriented lexer with a line-oriented CLI would be messy, I suspect.

cheers

andrew
