Re: Locale agnostic unicode text

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Locale agnostic unicode text
Date: 2005-01-24 17:06:54
Message-ID: 22403.1106586414@sss.pgh.pa.us
Lists: pgsql-hackers

Greg Stark <gsstark(at)mit(dot)edu> writes:
> On Sat, 22 Jan 2005 17:09:42 -0500, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I would imagine that the performance is spectacularly awful :-(.
>> Have you benchmarked it? A large sort on a unitext column,
>> for instance, would be revealing.

> Why do you persist in believing this? I sent timing results of doing a
> setlocale for every record here about a year ago. Sorting on the pg_strxfrm I
> posted (and Conway rewrote) was about twice as slow as sorting without using
> it. So it's slow but not spectacularly awful.

glibc is not the world. I tried Dawid's functions on Mac OS X, being a
random non-glibc platform that I happen to use. Using some text data
I had handy (44500 lines, 1.9MB) I made a single-column text table and
timed
    explain analyze select * from foo order by f1;
The results were
    In C locale, SQL_ASCII encoding:    820 ms
    In C locale, UNICODE encoding:      825 ms
    Using Dawid's functions:          62010 ms
    Stripped-down functions:          21010 ms

The "stripped-down" functions were the same functions without the
locale overhead, e.g.

CREATE OR REPLACE FUNCTION unitext_le(unitext,unitext) RETURNS boolean AS $$
    my $ret = ($_[0] le $_[1]) ? 't' : 'f';
    return $ret;
$$ LANGUAGE plperlu STABLE;

so we may conclude that about one-third of the overhead is plperl's
fault and the other two-thirds is setlocale's fault. But it's still
about a factor of 75 slowdown (62010 ms versus 820 ms) to do it this way
(actually worse, since not all of the EXPLAIN ANALYZE total runtime went
into sorting).
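
For anyone who wants to check the one-third / two-thirds split, the arithmetic can be sketched as below. This is only a rough illustration in Python: the three timings are copied from the results above, the variable names are made up for the example, and it assumes the C-locale run is a fair baseline for the time that is not overhead.

```python
# Timings in milliseconds, copied from the results above.
baseline_ms = 820      # C locale, SQL_ASCII encoding (assumed baseline)
full_ms = 62010        # Dawid's functions: plperl plus setlocale
stripped_ms = 21010    # same functions without the locale calls

# Overhead relative to the baseline, split by cause.
total_overhead = full_ms - baseline_ms        # everything above the baseline
plperl_overhead = stripped_ms - baseline_ms   # cost that remains without setlocale
setlocale_overhead = full_ms - stripped_ms    # cost removed by dropping setlocale

plperl_share = plperl_overhead / total_overhead
setlocale_share = setlocale_overhead / total_overhead

print(f"plperl share of overhead:    {plperl_share:.0%}")
print(f"setlocale share of overhead: {setlocale_share:.0%}")
```

The two shares add up to the whole overhead by construction, so the split depends only on which run is taken as the baseline.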

I'm not sure what your threshold of "spectacularly awful" is, but that
meets mine.

regards, tom lane
