pg_do_encoding_conversion glitch

From: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: pgsql-hackers(at)postgresql(dot)org
Subject: pg_do_encoding_conversion glitch
Date: 2008-11-10 11:23:23
Message-ID: 20081110195144.7F35.52131E4D@oss.ntt.co.jp
Lists: pgsql-hackers

I have a question about the result contract of pg_do_encoding_conversion().
It can receive a string that is not null-terminated, because its arguments
are a char array and a byte length.
But it returns only a string, with no length, so the result should be
null-terminated.

However, if no conversion is required, the function returns
the input string itself, even though it might not be null-terminated.

I checked the callers of pg_do_encoding_conversion(), and xml_parse()
could run into trouble. Is this a bug? Does it need to be fixed?

---- [utils/mb/mbutils.c]
unsigned char *
pg_do_encoding_conversion(unsigned char *src, int len,
                          int src_encoding, int dest_encoding)
{
    ...
    if (src_encoding == dest_encoding)
        return src;
----

---- [utils/adt/xml.c]
static xmlDocPtr
xml_parse(text *data, XmlOptionType xmloption_arg, bool preserve_whitespace,
          xmlChar * encoding)
{
    ...
    len = VARSIZE(data) - VARHDRSZ; /* will be useful later */
    string = xml_text2xmlChar(data);

    utf8string = pg_do_encoding_conversion(string,
                                           len,
                                           encoding ?
                                           xmlChar_to_encoding(encoding) :
[It could be UTF8 to UTF8] -->             GetDatabaseEncoding(),
                                           PG_UTF8);
----
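
The hazard can be reproduced outside the backend with a minimal sketch. Everything in it is an assumption made for illustration: sketch_do_conversion() and sketch_convert_terminated() are hypothetical names, malloc() stands in for palloc(), and the encoding IDs are arbitrary integers. The first function only mimics the fast path quoted from mbutils.c; the second shows one possible caller-side guard, not necessarily the fix that should go into the tree.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for pg_do_encoding_conversion(). Only the fast
 * path from the excerpt is modeled: when the encodings match, the input
 * pointer is returned unchanged. The real conversion is NOT modeled;
 * the other branch just returns a null-terminated copy.
 */
static unsigned char *
sketch_do_conversion(unsigned char *src, int len,
                     int src_encoding, int dest_encoding)
{
    unsigned char *result;

    assert(src != NULL && len >= 0);

    if (src_encoding == dest_encoding)
        return src;             /* may not be null-terminated */

    result = malloc((size_t) len + 1);
    if (result == NULL)
        return NULL;
    memcpy(result, src, (size_t) len);
    result[len] = '\0';
    return result;
}

/*
 * Hypothetical caller-side guard: if the input pointer came back
 * unchanged, make a null-terminated copy, so the caller always gets a
 * freshly allocated, terminated buffer (or NULL on allocation failure).
 */
static unsigned char *
sketch_convert_terminated(unsigned char *src, int len,
                          int src_encoding, int dest_encoding)
{
    unsigned char *converted;
    unsigned char *copy;

    converted = sketch_do_conversion(src, len, src_encoding, dest_encoding);
    if (converted != src)
        return converted;       /* already terminated by the other branch */

    copy = malloc((size_t) len + 1);
    if (copy == NULL)
        return NULL;
    memcpy(copy, src, (size_t) len);
    copy[len] = '\0';
    return copy;
}
```

With a 3-byte input stored in a buffer that has no terminator after it, sketch_do_conversion() with equal encodings hands back the same pointer, so reading it as a C string would run past the 3 bytes. sketch_convert_terminated() returns a separate buffer whose byte at index len is '\0'. The guard costs one extra copy in the no-conversion case, which is the trade-off a real fix would have to weigh.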

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
