Re: [Q] UTF-8 testing with Windows/ODBC 8.3.0400

From: "V S P" <toreason(at)fastmail(dot)fm>
To: "Craig Ringer" <craig(at)postnewspapers(dot)com(dot)au>
Cc: pgsql-odbc(at)postgresql(dot)org
Subject: Re: [Q] UTF-8 testing with Windows/ODBC 8.3.0400
Date: 2009-03-18 08:09:49
Message-ID: 1237363789.13407.1306005793@webmail.messagingengine.com
Lists: pgsql-odbc


Hi, thank you for the follow-up.
I just had a breakthrough...

I believe I was able to resolve most of the problems.

I finally found a post on the net that says:
if you want UTF-8 in your ODBC-based client program and Postgres is in
UTF-8,
then use the ASCII driver, not the Unicode one.

As soon as I switched drivers, things worked.
(I also kept: set client_encoding='UTF8' on my ODBC connections at
startup.)

I used:

wchar_t wData[1000];
::MultiByteToWideChar(CP_UTF8, 0, my_normal_std_string.c_str(), -1,
wData, 1000);

to convert the data I read and display it in the debugger.
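
A portable sketch of the same conversion, for builds where MultiByteToWideChar is not available (for example under unixODBC). The function name utf8_to_wstring is an assumption for illustration and is not part of OTL or the ODBC API. It handles both wchar_t widths, and it is deliberately simplified: it does not reject overlong encodings or encoded surrogate code points.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Decode UTF-8 bytes into a std::wstring (hypothetical helper).
// Emits UTF-16 surrogate pairs when wchar_t is 2 bytes (as on Windows)
// and plain UTF-32 code points when wchar_t is 4 bytes.
std::wstring utf8_to_wstring(const std::string& in) {
    std::wstring out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char lead = static_cast<unsigned char>(in[i]);
        std::uint32_t cp;
        std::size_t extra;  // number of continuation bytes expected
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::runtime_error("invalid UTF-8 lead byte");

        if (i + extra >= in.size())
            throw std::runtime_error("truncated UTF-8 sequence");

        for (std::size_t k = 1; k <= extra; ++k) {
            unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::runtime_error("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp > 0x10FFFF)
            throw std::runtime_error("code point out of Unicode range");
        i += extra + 1;

        if (sizeof(wchar_t) == 2 && cp > 0xFFFF) {
            // Split into a UTF-16 surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<wchar_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<wchar_t>(0xDC00 + (cp & 0x3FF)));
        } else {
            out.push_back(static_cast<wchar_t>(cp));
        }
    }
    return out;
}
```

It would be called as utf8_to_wstring(my_normal_std_string) on the bytes read from the varchar column.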

Before the switch to the ASCII ODBC driver, the above just showed
question marks.

So reading UTF-8 through ODBC using the ASCII driver works
perfectly.

Now that I understood what was going on, I converted most of my strings
to wstrings and then enabled the Unicode version of the PG ODBC driver,
and that works too!

:-) So now I have a #define where I switch between wstrings and strings
(and, of course, a few other things), and then I flip the drivers in the
ODBC data source
and things work (I have tested selects so far).
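
A minimal sketch of what such a switch could look like. The macro USE_UNICODE_DRIVER and the names db_string, db_char, DB_TEXT and column_name are all hypothetical, chosen only for illustration; the actual #define in my program and the OTL settings that go with it are not shown here.

```cpp
#include <string>

// Hypothetical build switch: define USE_UNICODE_DRIVER when the data source
// points at the Unicode driver; leave it undefined for the non-Unicode one.
#ifdef USE_UNICODE_DRIVER
typedef std::wstring db_string;
#define DB_TEXT(s) L##s   // wide string literal
#else
typedef std::string db_string;
#define DB_TEXT(s) s      // narrow string literal (UTF-8 bytes)
#endif

// Character type that matches the selected string type.
typedef db_string::value_type db_char;

// Example: the same source line compiles against either string type.
inline db_string column_name() { return DB_TEXT("cntr_data"); }
```

The rest of the code then uses db_string and DB_TEXT everywhere, so flipping the driver only requires changing the one define and rebuilding.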

Three things that I am still not sure about, and maybe you can help:

a) Does the Postgres driver on unixODBC do the same as on Windows (that is,
are there Unicode and non-Unicode versions of the driver)?
(I am interested in the 64-bit Linux and 64-bit FreeBSD ones.)

b) I noticed that between the Unicode version (first trace below) and the
ASCII version (second trace) the value of the SWORD right before the SQLULEN
is different
(it is 12 with the ASCII driver and -9 with the Unicode one). What does this
mean?

disp_otrq_x86d 8a4-b90 EXIT SQLDescribeColW with return code 0 (SQL_SUCCESS)
    HSTMT 013F1BA8
    UWORD 11
    WCHAR * 0x01A28974 [ 9] "cntr_data"
    SWORD 512
    SWORD * 0x01A28BC4 (9)
    SWORD * 0x01A28BB8 (-9)
    SQLULEN * 0x01A28B94 (4096)
    SWORD * 0x01A28BA0 (0)
    SWORD * 0x01A28B88 (1)

disp_otrq_x86d ab8-498 EXIT SQLDescribeColW with return code 0 (SQL_SUCCESS)
    HSTMT 013F1C38
    UWORD 11
    WCHAR * 0x01A28974 [ 9] "cntr_data"
    SWORD 512
    SWORD * 0x01A28BC4 (9)
    SWORD * 0x01A28BB8 (12)
    SQLULEN * 0x01A28B94 (4096)
    SWORD * 0x01A28BA0 (0)
    SWORD * 0x01A28B88 (1)
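
For reference when reading the two traces: the SWORD in question is the data type output argument of SQLDescribeCol, and in the standard ODBC headers 12 is SQL_VARCHAR while -9 is SQL_WVARCHAR. A minimal sketch of decoding these codes follows. The constants are redeclared locally so that it compiles without the ODBC headers, and the k-prefixed names and both helper functions are made up for illustration.

```cpp
#include <string>

// Numeric values as defined in the standard ODBC headers
// (sql.h, sqlext.h, sqlucode.h); redeclared here under hypothetical names.
enum {
    kSqlChar         = 1,    // SQL_CHAR
    kSqlVarchar      = 12,   // SQL_VARCHAR
    kSqlLongVarchar  = -1,   // SQL_LONGVARCHAR
    kSqlWChar        = -8,   // SQL_WCHAR
    kSqlWVarchar     = -9,   // SQL_WVARCHAR
    kSqlWLongVarchar = -10   // SQL_WLONGVARCHAR
};

// Map a character data type code to its ODBC name (hypothetical helper).
std::string sql_type_name(int code) {
    switch (code) {
        case kSqlChar:         return "SQL_CHAR";
        case kSqlVarchar:      return "SQL_VARCHAR";
        case kSqlLongVarchar:  return "SQL_LONGVARCHAR";
        case kSqlWChar:        return "SQL_WCHAR";
        case kSqlWVarchar:     return "SQL_WVARCHAR";
        case kSqlWLongVarchar: return "SQL_WLONGVARCHAR";
        default:               return "other (not a character type)";
    }
}

// True when the column is reported as a wide (Unicode) character type.
bool is_wide_char_type(int code) {
    return code == kSqlWChar || code == kSqlWVarchar || code == kSqlWLongVarchar;
}
```

With these, the first trace decodes as a wide varchar column and the second as a narrow one, which matches which driver was selected in each run.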

c) Another question: I have about 6 tables with about 20 fields in each
table.
2 of the fields are 65K long (they are declared as varchar(65000)). Is this OK
for the ODBC drivers, and what, if anything, should I be setting on them?

Thank you again for your follow-up,
Vlad

On Wed, 18 Mar 2009 15:46 +0900, "Craig Ringer"
<craig(at)postnewspapers(dot)com(dot)au> wrote:
> V S P wrote:
>
> > My C++ program relies on OTL C++ library to do DB access, and in the
> > Visual Studio debugger I see only question marks '?' for the strings.
>
> How would Visual Studio know that the std::string instances in question
> contain UTF-8 data? std::string is a byte string, not a character string
> - it could contain text in any encoding (or non-text data) and VC++ has
> no way of knowing how to interpret it.
>
> What it probably does is display anything within the ASCII range, and
> otherwise display ?s .
>
> If you expect to be able to work with those strings as real text, you
> probably want to use std::wstring instead, and USE APPROPRIATE ENCODING
> CONVERSION ROUTINES. Note that the width of wchar_t varies from platform
> to platform, so you'll need to convert to/from UTF-16 for a 2 byte
> wchar_t, or to/from UTF-32 for a 4-byte wchar_t.
>
> (I hate working with unicode and encodings in standard C++ *SO* much -
> argh! One of the only areas where I really wish I was using Java. If
> only the QString class from Qt was part of standard C++ ... ).
>
> > I am using std::string to store the bytestream from varchar column an I
> > think it is ok
> > because I do not need to 'manipulate' the content.
>
> True - but VC++ won't be able to understand what's in it, either.
>
> > I cannot figure out what else I might be doing wrong.... as I said, all
> > I need for now it is just to test out that a C++ program via ODBC can
> > get the data.
>
> Your description really isn't adequate to say. It's highly likely that
> you're retrieving the data from the database fine, but your tools don't
> know it's UTF-8 and aren't able to work with it correctly. That's mostly
> a guess with the amount of information you've provided, though.
>
> Perhaps you could post a small, self-contained test program and a SQL
> script to populate a test database? Then post the results of running the
> program against the database, including the hex values of the bytes
> returned by the ODBC interface.
>
> --
> Craig Ringer
--
V S P
toreason(at)fastmail(dot)fm

