From: | Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> |
---|---|
To: | Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp> |
Cc: | PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: client side syntax error localisation for psql (v1) |
Date: | 2004-03-12 12:57:08 |
Message-ID: | Pine.GSO.4.58.0403121345270.19051@elvis |
Lists: | pgsql-hackers |
Dear Tatsuo,
> > > 1) a character is not always represented on a terminal in proportion to
> > > its storage size. For example, a kanji character in UTF-8 encoding
> > > has a storage size of 3 bytes, while it occupies only twice the space
> > > of an ASCII character on a terminal. The same can be said of LATIN
> > > 2, 3, etc. in UTF-8, perhaps.
> >
> > I thought I dealt with that in the code by calling PQmblen for every char.
> > Am I wrong ?
>
> PQmblen returns the storage size, which is not necessarily the same as the
> character width represented on a terminal. For example, for a kanji
> character in UTF-8, PQmblen returns 3, but it occupies 2 x ASCII
> character space, not x 3. Isn't that a problem for you?
If I read you correctly, you mean that 1 character may take 3 bytes
of storage in the string, but it is not guaranteed to be 1 character
from the terminal perspective... Argh, that's definitely an issue:-(
I assumed that one character whatever the encoding would be 1 character
on the display.
If that is not the case, I think I can put/compute this information in the
translation structures that are used by PQmblen, and implement a
PQmbtermlen function...
Maybe you could point me to some source of information about the display
lengths of characters depending on the encoding?
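
To make the distinction between storage size and display width concrete, here is a minimal sketch for UTF-8 only. It is not libpq code: `utf8_mblen`, `cp_termlen` and `utf8_termlen` are hypothetical names, and the width table is a deliberately coarse approximation (a few East Asian wide ranges count as 2 columns, one combining range as 0, everything else as 1).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Storage size in bytes of the UTF-8 sequence that starts with lead byte c.
 * Invalid lead bytes count as one byte so that a scan always advances. */
static int utf8_mblen(unsigned char c)
{
    if (c < 0x80) return 1;
    if ((c & 0xE0) == 0xC0) return 2;
    if ((c & 0xF0) == 0xE0) return 3;
    if ((c & 0xF8) == 0xF0) return 4;
    return 1;
}

/* Decode one code point from a complete sequence of len bytes. */
static uint32_t utf8_decode(const unsigned char *s, int len)
{
    uint32_t cp;
    int i;

    if (len == 1) return s[0];
    cp = s[0] & (0xFF >> (len + 1));    /* keep the payload bits of the lead byte */
    for (i = 1; i < len; i++)
        cp = (cp << 6) | (s[i] & 0x3F); /* append 6 bits per continuation byte */
    return cp;
}

/* ASSUMPTION: coarse width table, for illustration only. */
static int cp_termlen(uint32_t cp)
{
    if (cp >= 0x0300 && cp <= 0x036F) return 0;   /* combining diacritical marks */
    if ((cp >= 0x1100 && cp <= 0x115F) ||         /* Hangul Jamo leading consonants */
        (cp >= 0x2E80 && cp <= 0xA4CF) ||         /* CJK radicals through Yi */
        (cp >= 0xAC00 && cp <= 0xD7A3) ||         /* Hangul syllables */
        (cp >= 0xF900 && cp <= 0xFAFF) ||         /* CJK compatibility ideographs */
        (cp >= 0xFE30 && cp <= 0xFE4F) ||         /* CJK compatibility forms */
        (cp >= 0xFF00 && cp <= 0xFF60) ||         /* fullwidth forms */
        (cp >= 0xFFE0 && cp <= 0xFFE6))
        return 2;
    return 1;
}

/* Number of terminal columns occupied by a NUL-terminated UTF-8 string. */
static int utf8_termlen(const char *s)
{
    int cols = 0;

    assert(s != NULL);
    while (*s != '\0')
    {
        int len = utf8_mblen((unsigned char) *s);
        int i;

        /* A truncated sequence is counted as one column per remaining byte. */
        for (i = 1; i < len; i++)
            if (s[i] == '\0')
                break;
        if (i < len)
        {
            cols += 1;
            s += 1;
            continue;
        }
        cols += cp_termlen(utf8_decode((const unsigned char *) s, len));
        s += len;
    }
    return cols;
}
```

With this sketch, the kanji U+6F22 (bytes `E6 BC A2`) has a storage size of 3 and a display width of 2, which is the mismatch discussed above. A real implementation would need a per-encoding table, and probably a more complete Unicode width table as well.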
> > What I mean by "ASCII compatible" is that spaces, new lines, carriage
> > returns, tabs and NUL (C string termination) are one-byte characters.
> > This assumption seemed pretty safe to me.
>
> I think you can do it safely using PQmblen.
OK, what you describe is basically what I've done with the qidx
computation, as suggested by Tom Lane; later on, I check that the
encoded length is one to find my special characters.
Thanks for your reply,
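
The kind of scan described here can be sketched as follows. This is only an illustration under stated assumptions, not the patch itself: `sketch_mblen` is a hypothetical UTF-8-only stand-in for PQmblen, and `locate_char` and `ErrLoc` are made-up names. The scan walks the query one character at a time and treats a byte as a newline only when the character's encoded length is one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PQmblen, UTF-8 only: byte length of the
 * character at s, clamped so that it never runs past the terminating NUL. */
static int sketch_mblen(const char *s)
{
    unsigned char c = (unsigned char) s[0];
    int len = 1;
    int i;

    if ((c & 0xE0) == 0xC0) len = 2;
    else if ((c & 0xF0) == 0xE0) len = 3;
    else if ((c & 0xF8) == 0xF0) len = 4;

    for (i = 1; i < len; i++)
        if (s[i] == '\0')
            return i;
    return len;
}

typedef struct
{
    int    line;      /* 1-based line number */
    int    col;       /* 1-based column, counted in characters */
    size_t byte_off;  /* 0-based byte offset into the query string */
} ErrLoc;

/* Map a 1-based character position to line, column and byte offset. */
static ErrLoc locate_char(const char *query, int charpos)
{
    ErrLoc loc = {1, 1, 0};
    int ci = 1;

    assert(query != NULL);
    while (query[loc.byte_off] != '\0' && ci < charpos)
    {
        int len = sketch_mblen(query + loc.byte_off);

        /* Special characters are recognized only as one-byte characters. */
        if (len == 1 && query[loc.byte_off] == '\n')
        {
            loc.line++;
            loc.col = 1;
        }
        else
            loc.col++;

        loc.byte_off += (size_t) len;
        ci++;
    }
    return loc;
}
```

Note that `col` counts characters, not terminal columns, so a marker placed under a line containing wide characters would still be misaligned unless display widths are taken into account.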
--
Fabien Coelho - coelho(at)cri(dot)ensmp(dot)fr