Luke Lonergan wrote:
> Is there a good source of multi-byte COPY test data? What is currently
> done to test transcoding support, i.e. the case where the client and
> server encodings differ?
> I notice that the regression data in the CVS version of PostgreSQL does
> not seem to include anything other than ASCII data. Is there another
> source of data/cases we're missing?
> Also - Alon's looking into this, but it would appear that the presumption on
> EOL for two-byte encodings is 0x0a+0xNN, where 0x0a is followed by any byte.
> Similar for other current control characters (escape, delimiter). Is there
> a definition of format and semantics for COPY with 2-byte encodings we
> should look at?
> I've looked at the code and at docs such as sql-copy.html, and the question
> is relevant because of the following two cases:
> If newline were defined as 0x0a+0x00, as opposed to 0x0a+0xNN where NN is
> an arbitrary byte, we could parse using 16-bit logic.
> If newline were defined as 0x0a+0xNN, we must use byte-wise parsing.
We have two- and three-byte encodings, so 16-bit parsing seems like it
wouldn't work. I am not aware of any spec other than the C code itself.
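To illustrate the width problem, here is a rough Python sketch (not PostgreSQL code). It assumes UTF-8 as a stand-in for any variable-width encoding, and all function names are made up for the example. The hypothetical 16-bit scan looks only at 2-byte-aligned offsets, so it misses a newline that lands on an odd offset after a three-byte character. The last helper shows a separate hazard with some encodings such as Shift_JIS, where the second byte of a character can fall in the ASCII range.

```python
# Sketch only: UTF-8 stands in for any variable-width encoding.
# All names here are hypothetical; none come from the PostgreSQL source.

def char_widths(text, encoding="utf-8"):
    """Return the encoded width in bytes of each character."""
    return [len(ch.encode(encoding)) for ch in text]


def split_rows_bytewise(data):
    """Split on the single byte 0x0a, examining one byte at a time."""
    rows, start = [], 0
    for i, byte in enumerate(data):
        if byte == 0x0A:
            rows.append(data[start:i])
            start = i + 1
    if start < len(data):
        rows.append(data[start:])  # trailing row without a newline
    return rows


def newline_offsets_16bit(data):
    """Hypothetical 16-bit scan: check only the first byte of each
    2-byte-aligned unit, so newlines at odd offsets are never seen."""
    return [i for i in range(0, len(data), 2) if data[i] == 0x0A]


def has_ascii_trail_byte(ch, encoding, ascii_byte):
    """True if ascii_byte appears after the first byte of the encoded char."""
    return ascii_byte in ch.encode(encoding)[1:]


if __name__ == "__main__":
    data = "日\n本語\n".encode("utf-8")
    print(char_widths("aé日"))
    print(split_rows_bytewise(data))
    print(newline_offsets_16bit(data))
    print(has_ascii_trail_byte("表", "shift_jis", 0x5C))
```

In this sample the first newline sits at byte offset 3, so the aligned scan reports only the second one, while the byte-wise scan finds both rows.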
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073