Re: Escape handling in COPY, strings, psql

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Sergey Ten <sergey(at)sourcelabs(dot)com>, "'Christopher Kings-Lynne'" <chriskl(at)familyhealth(dot)com(dot)au>, jason(at)sourcelabs(dot)com
Subject: Re: Escape handling in COPY, strings, psql
Date: 2005-05-30 03:50:53
Message-ID: 200505300350.j4U3orF01875@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches

Peter Eisentraut wrote:
> Bruce Momjian wrote:
> > > I am thinking the only clean solution is to add a special keyword
> > > like ESCAPE before strings that contain escape information. I
> > > think a GUC is too general. You know if the string is a constant
> > > if it contains escapes just by looking at it, and if it is a
> > > variable, hopefully you know if it has escapes.
>
> I do support gradually phasing out backslash escapes in standard string
> literals in the interest of portability. Most of the current escape
> sequences are of limited value anyway. Let's think about ways to get
> there:
>
> Enabling escape sequences in string literals controls the formatting of
> input (and output?) data, so it is akin to, say, the client encoding
> and the date style, so a GUC variable isn't out of the question in my
> mind. It makes most sense, though, if we want to eventually make users
> switch it off all the time, that is, as a transition aid. But before
> that can happen, we need to come up with an alternative mechanism to
> enter weird characters.
>
> One such way may be to provide functions (say, chr(), tab(), etc.) to
> give access to unprintable characters, but that will result in terrible
> performance for long strings and it also won't help with COPY or places
> where only literals are allowed.
>
> Another way would be to allow escape sequences only in specially marked
> strings. The proposal above doing 'foo' ESCAPE 'x' seems fairly
> elegant for SQL linguists but would be pretty weird to implement in the
> lexer. It won't help with COPY either, but that is really the case for
> all solutions.

I was suggesting ESCAPE 'string' or ESC 'string'. The marker has to be
before the string so scan.l can alter its processing of the string ---
after the string is too late --- there is no way to undo any escaping
that has happened, and it might already be used by gram.y.

I could probably hack up a sample implementation if people are
interested.

> A more compact representation may be using a prefix letter, like E'foo'.
> This fits the SQL syntax, is familiar with Python programmers (although
> in the other direction), and can be implemented efficiently in the
> lexer. I like that the best, personally.
>
> For COPY, we would probably have to use a flag in the COPY command
> itself either way (like already done for NULL AS).

I agree with Tom that COPY has to be left unchanged. The fundamental
problem is the representation of NULL values, that I don't think we can
do without some escape mechanism. Single-quote escapes works by
doubling them, but once you need to represent something more like
null's, I can't think of a solution without escapes.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2005-05-30 04:04:31 Re: Escape handling in COPY, strings, psql
Previous Message Tom Lane 2005-05-30 02:33:39 Re: locks in CREATE TRIGGER, ADD FK

Browse pgsql-patches by date

  From Date Subject
Next Message Bruce Momjian 2005-05-30 04:04:31 Re: Escape handling in COPY, strings, psql
Previous Message Tom Lane 2005-05-30 03:09:25 Re: skip FK trigger on UPDATE