On Wed, Sep 14, 2005 at 04:20:13PM -0700, Mark Richardson wrote:
> If I have a table defined...
> CREATE TABLE myTable (myByteaField bytea NOT NULL);
> then populate it...
> COPY myTable (myByteaField) FROM stdin;
Bytea values should be escaped with two backslashes, not one. See
"Binary Data Types" in the documentation:
> SELECT length(myByteaField) FROM stdin;
Please post the actual command you ran; the above fails with 'relation
"stdin" does not exist.' Presumably you really queried myTable (this
might seem like nitpicking, but sometimes little differences matter).
> I'm assuming that this is because of the length function (seeing
> the '\0' causes strlen to be shorter than the actual data).
Actually it's because the data wasn't fully loaded due to the
incorrectly-escaped \000 characters:
The length function doesn't have a problem with \000 characters:
If you look at the implementation of the length(bytea) function
(byteaoctetlen() in src/backend/utils/adt/varlena.c) you'll see
that it doesn't read the data at all; it simply reads a size
attribute that's stored with the data.
> So I try to use the libpq interface, and when I execute the code
> PQgetLength(pgresult, 0, 0)
> the result is 0 (for record 0, field )
Again, please post what you actually did; the above should have
failed to link with an error like "undefined reference to `PQgetLength'"
(the actual function name is PQgetlength).
> PQgetLength(pgresult, 2, 0)
> The result is 1 (for record 2, field 0). So this leads me to
> believe that the values are stored in the database as a string, and
> they are being terminated during the copy (I also tried inserts,
> with the same results).
The values are stored as a sequence of bytes, and they are indeed
being truncated during the COPY because the data was incorrectly
> On a last ditch effort, I used ECPG with the following code...
> EXEC SQL BEGIN DECLARE SECTION;
> char myByteaField_;
> EXEC SQL END DECLARE SECTION;
> EXEC SQL SELECT myByteaField INTO :myByteaField_ FROM myTable;
> for (int i=0; i<3; i++)
> for (int j=0; j<8; j++)
> std::cout << myByteaField_[i][j] << "(" << int(myByteaField_[i][j] << ").";
> std::cout << std::endl;
> I get the following....
> (0).(0).(0).(0).(0).(0).(0).(0). <- this is expected
The myByteaField_ array might have been zeroed to begin with, and
you're probably seeing that data, not data filled in from the query.
What happens if you fill the array with a character like '*' before
doing the SELECT? If I do that with the data input as in your
example, I see the first character as \0 and the rest of the data
as the garbage in the array:
> \(92).0(48).1(49).1(49).A(65).\(92).2(50).0(48) <- why escape sequence?
Non-printable characters are escaped unless you request the results
in binary; see the documentation link posted earlier.
> \(92).0(48).1(49).2(50).(0).(0).(0).(0) <- dropped the last 6 bytes
> So I'm pretty sure that the data is stored as a string, and it
> is being terminated at the '\0'.
Bytea and other variable-length types are stored as a length attribute
and then the data itself. See the discussion of TOAST in the
> ECPG not only has the same behavior, but the conversion of the
> data to a char array seems incorrect (maybe the ECPG pre-compiler
> is thinking that I want to_char() of the value of the byte array).
Unless you request binary results, queries return text representations
of the data; for bytea that means that non-printable characters will
> So my question is, how would you be able to store any data in a
> bytea if it contained any bytes with the value of 0 (I'm sure there
> are images stored as bytea and they have 0 byte values). I don't
> need to store an image, just these 8 bytes.
Escape the data correctly with two backslashes; the "Binary Data
Types" documentation explains why this is necessary.
In response to
pgsql-interfaces by date
|Next:||From: Mark Richardson||Date: 2005-09-15 19:18:05|
|Subject: Re: libpq, ecpg and the bytea data type|
|Previous:||From: Tom Lane||Date: 2005-09-15 02:49:53|
|Subject: Re: Beta of DBD::Pg 1.44 available |