Re: improve Chinese locale performance

From:	Martijn van Oosterhout <kleptog(at)svana(dot)org>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Craig Ringer <craig(at)2ndquadrant(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Quan Zongliang <quanzongliang(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: improve Chinese locale performance
Date:	2013-07-28 09:39:40
Message-ID:	20130728093940.GA5652@svana.org
Lists:	pgsql-hackers

On Tue, Jul 23, 2013 at 10:34:21AM -0400, Robert Haas wrote:
> I pretty much lost interest in ICU upon reading that they use UTF-16
> as their internal format.
>
> http://userguide.icu-project.org/strings#TOC-Strings-in-ICU

The UTF-8 support has been steadily improving:

For example, icu::Collator::compareUTF8() compares two UTF-8 strings
incrementally, without converting all of the two strings to UTF-16 if
there is an early base letter difference.

http://userguide.icu-project.org/strings/utf-8

For all other encodings you should be able to use an iterator. As to
performance I have no idea.

The main issue with strxfrm() is its lame API. If it supported
returning prefixes you'd be set, but as it is you need >10MB of memory
just to transform a 10MB string, even if only the first few characters
would be enough to sort...
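
To make the strxfrm() complaint concrete, here is a minimal sketch in
standard C. The helper names make_sort_key() and compare_via_keys() are
made up for illustration and are not from PostgreSQL or any library. The
only way to learn the key size is to ask strxfrm() for the full length,
and the C standard leaves the buffer contents indeterminate if the buffer
is too small, so a short prefix of the key cannot be requested.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: build the complete strxfrm() sort key for s under
 * the current LC_COLLATE locale. Returns a malloc'd buffer (caller frees)
 * or NULL on allocation failure. If key_len is non-NULL it receives the
 * key length, excluding the terminating NUL.
 */
char *make_sort_key(const char *s, size_t *key_len)
{
    /* With a zero-size buffer, strxfrm() only reports the length needed. */
    size_t need = strxfrm(NULL, s, 0);
    char *key = malloc(need + 1);

    if (key == NULL)
        return NULL;

    /*
     * The buffer must hold the entire key: if it is smaller than
     * need + 1 bytes, its contents are indeterminate, so there is no
     * portable way to obtain only the first few key bytes.
     */
    strxfrm(key, s, need + 1);

    if (key_len != NULL)
        *key_len = need;
    return key;
}

/*
 * Hypothetical helper: compare two strings through their full sort keys.
 * Falls back to strcoll() if either key could not be allocated.
 */
int compare_via_keys(const char *a, const char *b)
{
    char *ka = make_sort_key(a, NULL);
    char *kb = make_sort_key(b, NULL);
    int result;

    if (ka != NULL && kb != NULL)
        result = strcmp(ka, kb);   /* byte order of keys == collation order */
    else
        result = strcoll(a, b);

    free(ka);
    free(kb);
    return result;
}
```

Both keys are materialised in full before strcmp() looks at the first
byte, which is exactly the cost described above: memory and transform
time scale with the whole string even when the first few characters
already decide the order.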

Mvg,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he does
> not attach much importance to his own thoughts.
-- Arthur Schopenhauer

In response to

Re: improve Chinese locale performance at 2013-07-23 14:34:21 from Robert Haas

Responses

Re: improve Chinese locale performance at 2013-08-01 16:09:45 from Robert Haas
