Re: client side syntax error localisation for psql (v1)

From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
Cc: PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: client side syntax error localisation for psql (v1)
Date: 2004-03-12 12:57:08
Message-ID: Pine.GSO.4.58.0403121345270.19051@elvis
Lists: pgsql-hackers


Dear Tatsuo,

> > > 1) a character is not always represented on a terminal proportionally
> > > to its storage size. For example a kanji character in UTF-8 encoding
> > > has a storage size of 3 bytes, while on a terminal it occupies only
> > > twice the space of an ASCII character. The same can perhaps be said
> > > of LATIN 2, 3 etc. in UTF-8.
> >
> > I thought I dealt with that in the code by calling PQmblen for every char.
> > Am I wrong ?
>
> PQmblen returns the storage size, which is not necessarily the same as
> the character width represented on a terminal. For example, for a kanji
> character in UTF-8 PQmblen returns 3, but it occupies 2 x ASCII
> character space, not x 3. Isn't that a problem for you?

If I read you correctly, you mean that 1 character may take 3 bytes
of storage in the string, but it is not guaranteed to be 1 character
from the terminal perspective... Argh, that's definitely an issue:-(

I assumed that one character whatever the encoding would be 1 character
on the display.

If that is not the case, I think I can put/compute this information in the
translation structures that are used by PQmblen, and implement a
PQmbtermlen function...
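
To make the difference concrete, here is a minimal sketch for UTF-8 only.
All names in it (utf8_mblen, utf8_termlen, term_column) are hypothetical
helpers written for illustration and are not part of libpq; the table of
"wide" code point ranges is a crude approximation, and the input is assumed
to be well-formed UTF-8.

```c
#include <stddef.h>
#include <stdint.h>

/* Storage length in bytes of the UTF-8 character starting at s.
 * Hypothetical stand-in for the kind of value PQmblen reports. */
int utf8_mblen(const unsigned char *s)
{
    if (*s < 0x80) return 1;
    if ((*s & 0xE0) == 0xC0) return 2;
    if ((*s & 0xF0) == 0xE0) return 3;
    if ((*s & 0xF8) == 0xF0) return 4;
    return 1;                   /* invalid lead byte: step over one byte */
}

/* Decode one code point; assumes a well-formed sequence of `len` bytes. */
uint32_t utf8_decode(const unsigned char *s, int len)
{
    switch (len)
    {
        case 1:
            return s[0];
        case 2:
            return ((uint32_t) (s[0] & 0x1F) << 6) | (s[1] & 0x3F);
        case 3:
            return ((uint32_t) (s[0] & 0x0F) << 12) |
                   ((uint32_t) (s[1] & 0x3F) << 6) | (s[2] & 0x3F);
        default:
            return ((uint32_t) (s[0] & 0x07) << 18) |
                   ((uint32_t) (s[1] & 0x3F) << 12) |
                   ((uint32_t) (s[2] & 0x3F) << 6) | (s[3] & 0x3F);
    }
}

/* Display width in terminal columns: 2 for a few East Asian wide
 * ranges (assumed, incomplete list), 1 for everything else. */
int utf8_termlen(const unsigned char *s)
{
    uint32_t cp = utf8_decode(s, utf8_mblen(s));

    if ((cp >= 0x1100 && cp <= 0x115F) ||   /* Hangul Jamo */
        (cp >= 0x2E80 && cp <= 0xA4CF) ||   /* CJK, kana, ... */
        (cp >= 0xAC00 && cp <= 0xD7A3) ||   /* Hangul syllables */
        (cp >= 0xF900 && cp <= 0xFAFF) ||   /* CJK compatibility */
        (cp >= 0xFF00 && cp <= 0xFF60))     /* fullwidth forms */
        return 2;
    return 1;
}

/* Terminal column reached after the first `pos` bytes of str. */
int term_column(const char *str, size_t pos)
{
    const unsigned char *s = (const unsigned char *) str;
    size_t i = 0;
    int col = 0;

    while (i < pos && s[i] != '\0')
    {
        col += utf8_termlen(s + i);     /* advance by display width */
        i += (size_t) utf8_mblen(s + i); /* advance by storage size */
    }
    return col;
}
```

With these helpers a kanji advances the byte offset by 3 but the cursor
column by 2, which is the mismatch described above; an error marker placed
by byte count or by character count alone would land in the wrong column.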

Maybe you could point me to some source of information about the display
lengths of characters depending on the encoding?

> > What I mean by "ASCII compatible" is that spaces, new lines, carriage
> > returns, tabs and NUL (the C string terminator) are one-byte characters.
> > This assumption seemed pretty safe to me.
>
> I think you can do it safely using PQmblen.

Ok, what you describe is basically what I've done with the qidx
computation, as suggested by Tom Lane; later on, I check that the
encoded length is one to find my special characters.
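
As an illustration only, the idea can be sketched as follows for UTF-8:
step through the query one character at a time using the encoded length,
and treat a byte as a newline only when that length is one. The names
mb_len_utf8 and locate_line are hypothetical, this is not the code from the
patch, and well-formed input is assumed.

```c
#include <stddef.h>

/* Encoded length in bytes of the UTF-8 character starting at s.
 * Hypothetical stand-in for a call to PQmblen. */
int mb_len_utf8(const unsigned char *s)
{
    if (*s < 0x80) return 1;
    if ((*s & 0xE0) == 0xC0) return 2;
    if ((*s & 0xF0) == 0xE0) return 3;
    if ((*s & 0xF8) == 0xF0) return 4;
    return 1;                   /* invalid lead byte: step over one byte */
}

/* Return the 1-based line number holding the character at index cpos
 * (0-based, counted in characters, not bytes). The byte offset at which
 * that line starts is stored in *line_start. */
int locate_line(const char *query, int cpos, size_t *line_start)
{
    const unsigned char *s = (const unsigned char *) query;
    size_t i = 0;               /* byte offset */
    int c = 0;                  /* character index */
    int line = 1;

    *line_start = 0;
    while (c < cpos && s[i] != '\0')
    {
        int len = mb_len_utf8(s + i);

        /* only a one-byte character can be a line break */
        if (len == 1 && s[i] == '\n')
        {
            line++;
            *line_start = i + 1;
        }
        i += (size_t) len;
        c++;
    }
    return line;
}
```

The character index and the byte offset diverge as soon as a multibyte
character is crossed, which is why both counters are kept.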

Thanks for your reply,

--
Fabien Coelho - coelho(at)cri(dot)ensmp(dot)fr
