Re: psql display of Unicode combining characters in 8.2

From: Michael Fuhr <mike(at)fuhr(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Martijn van Oosterhout <kleptog(at)svana(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: psql display of Unicode combining characters in 8.2
Date: 2006-12-10 17:57:12
Message-ID: 20061210175712.GA41610@winnie.fuhr.org
Lists: pgsql-hackers

On Sun, Dec 10, 2006 at 12:30:12PM -0500, Tom Lane wrote:
> Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> > On Sat, Dec 09, 2006 at 10:50:05PM -0700, Michael Fuhr wrote:
> >> Should the code distinguish between combining characters and
> >> zero-width control characters so the former display correctly?
>
> > Probably, any idea how to tell the difference?
>
> I'm no expert, but isn't there a specific range of Unicode code points
> defined for combining characters?

Yes, there are several such ranges, with other combining characters
scattered about. Could we use the Unicode general category instead
(Mn = Mark, nonspacing; Me = Mark, enclosing)? ucs_wcwidth() in
src/backend/utils/mb/wchar.c already contains some of that knowledge,
doesn't it? The combining[] list looks incomplete but is otherwise
close to what we'd need.
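
To illustrate the general-category idea, here is a minimal sketch in
Python using the standard unicodedata module. It is not the C code in
wchar.c. The function name classify() and the choice to treat
categories Cc and Cf as the "zero-width control" group are assumptions
made for illustration only.

```python
import unicodedata

# General categories for combining marks, as named above:
# Mn = Mark, nonspacing; Me = Mark, enclosing.
COMBINING_CATEGORIES = {"Mn", "Me"}

# Assumption: treat Cc (control) and Cf (format) as the control group.
# The real list of characters to handle this way may differ.
CONTROL_CATEGORIES = {"Cc", "Cf"}


def classify(ch):
    """Return 'combining', 'control', or 'other' for a single character."""
    category = unicodedata.category(ch)
    if category in COMBINING_CATEGORIES:
        return "combining"
    if category in CONTROL_CATEGORIES:
        return "control"
    return "other"


if __name__ == "__main__":
    samples = ["a", "\u0301", "\u20dd", "\u200e", "\u0007"]
    for ch in samples:
        print("U+%04X %s %s" % (ord(ch), unicodedata.category(ch), classify(ch)))
```

A display routine could then pass "combining" characters through to the
terminal at zero width while escaping or dropping "control" characters;
how psql should actually treat each group is a separate question.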

--
Michael Fuhr
