Re: quoting psql varible as identifier

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-18 20:19:06
Message-ID: 603c8f071001181219q4261532ere5e07c67b8fca066@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jan 18, 2010 at 2:20 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
> 2010/1/18 Robert Haas <robertmhaas(at)gmail(dot)com>:
>> On Mon, Jan 18, 2010 at 1:52 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>>> 2010/1/18 Robert Haas <robertmhaas(at)gmail(dot)com>:
>>>> On Sun, Jan 17, 2010 at 2:04 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>>>>> I rewrote the patch, so the interface for PQescapeIdentConn is now
>>>>> the same as for PQescapeStringConn.
>>>>>
>>>>> @3. I thought the protection against incomplete multibyte chars was
>>>>> enough: missing bytes are replaced by spaces, as
>>>>> PQescapeStringConn does.
>>>>
>>>> That much is fine, but the output buffer is only guaranteed to be of
>>>> size 2n+1.  Imagine the input is two double-quotes followed by a byte
>>>> for which pg_encoding_mblen() returns 4.  The input is 3 characters
>>>> long so the user was responsible to provide 7 bytes of output space,
>>>> but you'll try to write 9 bytes to it (including the terminating NUL).
>>>>
>>> I don't understand. The "length" is the number of bytes, not the number
>>> of chars. Maybe it is only badly documented. If your input string has 6
>>> bytes, then the buffer has to be allocated with 13 bytes. Nobody knows
>>> how many chars are in there.
>>
>> Right, but the point is we can't assume that the input is validly
>> encoded.  If the input ends with a garbage character that looks like
>> the start of a multi-byte character, we can't assume that there's
>> enough space in the output buffer to store the required number of
>> padding spaces.
>>
>> To take an extreme example, suppose there were an encoding where any
>> time the first byte of a multi-byte character has the high-bit set,
>> the character is 100 bytes long.  Then suppose someone calls
>> PQescapeStringConn(), or this new function we're adding, with a length
>> argument of 1, and the first byte of the input buffer has the high-bit
>> set.  The caller is only required to provide a 3-byte output buffer,
>> and the third byte is needed for the terminating NUL.  That means that
>> after we copy that first character we only have room to insert one
>> padding space.  The way you had it coded, since we were expecting a
>> character 100 bytes long, we'd always try to insert 99 padding spaces.
>>
>
> Are you talking about the previous version?

Yes.
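
To make the overrun concrete, here is a minimal sketch of the bounded
padding I have in mind. It is not the code from either version of the
patch: toy_mblen() is a hypothetical stand-in for pg_encoding_mblen()
(any high-bit byte claims to start a 4-byte character), and
escape_ident_bounded() is a made-up name. The only point is that padding
for a truncated character stops at the 2n+1 limit instead of always
writing the full expected length.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for pg_encoding_mblen(): any byte with the
 * high bit set claims to start a 4-byte character. */
static int
toy_mblen(const unsigned char *s)
{
    return (*s & 0x80) ? 4 : 1;
}

/*
 * Illustrative only: copy up to "length" bytes of "from" into "to",
 * doubling double-quotes.  "to" must have room for 2*length + 1 bytes.
 * Returns the number of bytes written, not counting the NUL.
 */
static size_t
escape_ident_bounded(char *to, const char *from, size_t length)
{
    const char *src = from;
    const char *src_end = from + length;
    char       *dst = to;
    char       *dst_limit = to + 2 * length;    /* slot kept for the NUL */

    while (src < src_end && *src != '\0')
    {
        unsigned char c = (unsigned char) *src;
        int         len;
        int         i;

        if (!(c & 0x80))
        {
            /* Single-byte character: double it if it is a quote. */
            if (c == '"')
                *dst++ = '"';
            *dst++ = *src++;
            continue;
        }

        /* Copy as much of the multibyte character as the input has. */
        len = toy_mblen((const unsigned char *) src);
        for (i = 0; i < len && src < src_end && *src != '\0'; i++)
            *dst++ = *src++;

        if (i < len)
        {
            /* Incomplete character: pad, but never past dst_limit. */
            for (; i < len && dst < dst_limit; i++)
                *dst++ = ' ';
            break;
        }
    }

    *dst = '\0';
    return (size_t) (dst - to);
}
```

With the earlier example input (two double-quotes followed by one
high-bit byte, length 3, 7-byte buffer), this writes four quote bytes,
the stray byte, one space, and the NUL: exactly 7 bytes, where
unconditional padding would have needed 9.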

> In the current version, the new length is guaranteed to be <= 2x the
> original length.

Actually, strictly less than, but the code gets it correct. However,
your latest version has some other problems. For example, you didn't
update the docs to match your source-code changes. Also, I prefer an
API where the escaping function does include the quotes, so I've done
it that way in the attached patch. This is just the libpq changes; I
figure that if we can agree on this, we can move on to the psql stuff.
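
As a rough illustration of what "includes the quotes" means, here is a
small sketch. It is not the attached patch's interface: the name
quote_ident_alloc() is hypothetical, returning malloc'd memory is an
assumption made for the example, and multibyte handling is left out for
brevity.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not the patch's interface: return a malloc'd,
 * double-quoted copy of the first "length" bytes of "ident", or NULL
 * if allocation fails.  Ignores multibyte encodings for brevity.
 */
static char *
quote_ident_alloc(const char *ident, size_t length)
{
    /* Worst case: every byte doubled, plus two quotes and a NUL. */
    char       *result = malloc(2 * length + 3);
    char       *dst = result;
    size_t      i;

    if (result == NULL)
        return NULL;

    *dst++ = '"';
    for (i = 0; i < length && ident[i] != '\0'; i++)
    {
        if (ident[i] == '"')
            *dst++ = '"';       /* double any embedded quote */
        *dst++ = ident[i];
    }
    *dst++ = '"';
    *dst = '\0';

    return result;
}
```

The caller gets back something it can paste directly into a query and
never has to remember to add the surrounding quotes itself, which is the
part that is easy to get wrong.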

Comments?

...Robert

Attachment: PQescapeIdentifierConn.patch (text/x-patch, 17.2 KB)
