Re: [HACKERS] multi-byte aware char_length() etc.

From: "Thomas G(dot) Lockhart" <lockhart(at)alumni(dot)caltech(dot)edu>
To: t-ishii(at)sra(dot)co(dot)jp
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [HACKERS] multi-byte aware char_length() etc.
Date: 1998-03-19 05:50:40
Message-ID: 3510B230.4A815C51@alumni.caltech.edu
Lists: pgsql-hackers

> I'm planning to modify some string functions so that they are
> aware of multi-byte strings when compiled with the multi-byte
> capability. The following are the files I'm going to modify. I would
> like to hear your opinions, if you have any.
>
> o character_length()
>
> It seems that the function is implemented as textlen() in
> utils/adt/varlena.c or as varcharlen() in varchar.c. The current
> implementation returns an octet length rather than a char length, so I
> will change them. However, some applications might need to get an
> octet length. Maybe this is a good chance to add SQL92's
> octet_length().

Yes.
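
To make the distinction concrete, here is a minimal sketch of the two lengths. It assumes UTF-8 purely for illustration (the multi-byte build may use other encodings), and the names octet_len() and utf8_char_len() are hypothetical, not existing Postgres functions.

```c
#include <stddef.h>
#include <string.h>

/* Octet length: the number of bytes in the string, as strlen() reports. */
static size_t octet_len(const char *s)
{
    return strlen(s);
}

/*
 * Character length, ASSUMING the string is valid UTF-8 (an illustrative
 * choice, not something the thread specifies): count every byte that is
 * not a continuation byte of the form 10xxxxxx.
 */
static size_t utf8_char_len(const char *s)
{
    size_t n = 0;
    const unsigned char *p;

    for (p = (const unsigned char *) s; *p != '\0'; p++)
    {
        if ((*p & 0xC0) != 0x80)
            n++;
    }
    return n;
}
```

For a pure ASCII string the two functions agree; they differ as soon as a character takes more than one byte, which is the case character_length() and octet_length() would need to distinguish.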

> o lower()/upper()
>
> Implemented in oracle_compat.c. One thing I have noticed is that it
> uses toupper()/tolower(). For ASCII, they are fine. But on some
> platforms (I guess SysV) they might have some problems:
>
> char c;      /* c holds an 8-bit letter; char is signed on this platform */
> toupper(c);  /* negative argument: undefined behavior, may segfault */
>
> So I will change it like this:
>
> toupper((unsigned char)c);
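
As a rough sketch of how the cast might look once applied across a whole string (the helper name upper_in_place() is hypothetical, and this handles single-byte characters only, so a multi-byte aware version would need more than this):

```c
#include <ctype.h>

/*
 * Upper-case a NUL-terminated string in place, one byte at a time.
 * The cast to unsigned char keeps the argument to toupper() in the
 * range 0..UCHAR_MAX, so 8-bit bytes are safe even where plain char
 * is signed. Hypothetical helper, not an existing Postgres routine.
 */
static void upper_in_place(char *s)
{
    for (; *s != '\0'; s++)
        *s = (char) toupper((unsigned char) *s);
}
```

Which bytes above 0x7F actually change depends on the current locale; in the default "C" locale only a-z are mapped.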

I would like to move these routines, as you clean them up, to varlena.c
or whatever Postgres-specific source file is appropriate. Let's leave
oracle_compat.c for non-standard, Oracle-specific functions. Perhaps
eventually we can move any of those which remain to the contrib
directory, assuming that there are good equivalent functions available
in SQL92.

Sort of annoying having oracle_compat when Oracle doesn't return the
favor by having a "postgres_compat". Well, maybe DataBlades are the same
thing?? :)

> o position()
>
> Implemented as textpos() in varlena.c.
>
> o substring()
>
> Implemented as text_substr() in varlena.c.

These two are OK. I'm not yet clear on where in the parser these varlena
functions are matched up with both text and varchar() types. We may need
to do something different as we keep working on getting the
text/varchar/char behavior improved.
