Re: Locale timings

From: Michael Tiemann <tiemann(at)redhat(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Locale timings
Date: 2001-11-26 19:41:31
Message-ID: 3C029AEB.2030902@redhat.com
Lists: pgsql-hackers

This is a common way of doing things inside glibc, and the happy result is that
if you really want to build a non-locale-aware system, you can use a
compile-time option that replaces the "locale_is_not_C" test with a constant.
It makes for more maintainable code because there's less chance for bitrot in
the usual case.
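
To make that concrete, here is a minimal sketch of the idea, not code from glibc or PostgreSQL. The NO_LOCALE_SUPPORT switch and the compare_strings() helper are hypothetical names chosen for illustration, and locale_is_not_C is written as a function-like test rather than a variable. When the switch is defined, the test is a compile-time constant and the compiler can discard the locale-aware branch, yet both branches are still parsed and type-checked in every build.

```c
#include <locale.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical build switch: define NO_LOCALE_SUPPORT for a locale-free build. */
#ifdef NO_LOCALE_SUPPORT
/* The test collapses to a constant, so the locale branch is dead code. */
#define locale_is_not_C() (false)
#else
/* Run-time test: ask the C library which collation locale is active. */
static bool
locale_is_not_C(void)
{
    const char *name = setlocale(LC_COLLATE, NULL);

    return name != NULL &&
           strcmp(name, "C") != 0 &&
           strcmp(name, "POSIX") != 0;
}
#endif

/* Hypothetical caller: one code path for both kinds of build. */
static int
compare_strings(const char *a, const char *b)
{
    if (locale_is_not_C())
        return strcoll(a, b);   /* locale-aware code */
    else
        return strcmp(a, b);    /* non-locale code */
}
```

A real implementation would probably cache the result of the run-time test instead of calling setlocale() on every comparison.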

M

Peter Eisentraut wrote:

> I did some "benchmarks" to check whether --enable-locale with LC_ALL=C is
> just as fast as --disable-locale, to possibly justify making locale
> support the default. This test only covers locale-aware comparisons,
> which seems to be the critical aspect for all intents and purposes.
>
> I loaded a table of a single text column with 454240 rows of English
> words. The table had a size of 21.5 MB. The values were explicitly
> de-sorted, but the order was the same across all test runs. Then I ran
> SELECT * FROM test ORDER BY 1; and timed the wall-clock response time a
> few times. All configuration parameters were left at the default.
>
> The averaged results follow. Some logarithmic buffering cleverness
> appeared to surface, but the results are still distinct enough to be
> useful.
>
> no locale: 58s
> locale=C: 78s (ca. 33% slower)
> locale=en_US: 118s (ca. 100% slower)
>
> This confused me, because in my C library a strcoll() call with locale=C
> is handed to strcmp() quite directly. A look into varlena.c:varstr_cmp()
> shows that the locale-aware path does some extra copying because there is
> no strncoll() function we can use with non-terminated strings.
>
> For testing's sake I replaced the two palloc() calls in that function with
> alloca(), which is presumably the fastest possible memory allocator.
> Result:
>
> locale=C,alloca: 67s (ca. 15% slower)
>
> This shows that we're wasting quite a bit of time allocating memory --
> probably not only in this place. I'm pretty sure that the majority of the
> rest of the gap comes from the memcpy() operations. Not that there's a
> whole lot we can do about either of these things.
>
> However, I feel that we could reasonably cope with this situation by
> replacing
>
> #ifdef USE_LOCALE
> /* locale-aware code */
> #else
> /* non-locale code */
> #endif
>
> with
>
> if (locale_is_not_C)
> {
> /* locale-aware code */
> }
> else
> {
> /* non-locale code */
> }
>
> This practice should have minuscule impact, and it's probably the plan for
> the multibyte side of things as well.
>
>
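
For readers without the source at hand, the extra copying Peter describes can be sketched roughly as follows. This is an illustrative stand-in, not the actual varlena.c code: counted_str_cmp() and its use_locale flag are hypothetical, and malloc()/free() stand in for palloc()/pfree(). Because strcoll() only accepts NUL-terminated strings, the locale-aware path has to allocate and fill two temporary buffers per comparison, while the other path can compare the length-counted data in place.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical comparison of two length-counted (not NUL-terminated)
 * strings. Returns <0, 0, or >0 like strcmp().
 */
static int
counted_str_cmp(const char *a, size_t alen,
                const char *b, size_t blen,
                int use_locale)
{
    int result;

    if (use_locale)
    {
        /* strcoll() needs terminated input: two allocations, two copies. */
        char *abuf = malloc(alen + 1);
        char *bbuf = malloc(blen + 1);

        if (abuf == NULL || bbuf == NULL)
        {
            free(abuf);
            free(bbuf);
            abort();            /* out of memory; a sketch has no recovery */
        }
        memcpy(abuf, a, alen);
        abuf[alen] = '\0';
        memcpy(bbuf, b, blen);
        bbuf[blen] = '\0';

        result = strcoll(abuf, bbuf);

        free(abuf);
        free(bbuf);
    }
    else
    {
        /* Compare the common prefix in place; the shorter string sorts first. */
        size_t minlen = (alen < blen) ? alen : blen;

        result = strncmp(a, b, minlen);
        if (result == 0 && alen != blen)
            result = (alen < blen) ? -1 : 1;
    }
    return result;
}
```

With 454240 rows to sort, this function runs millions of times, which is consistent with the allocation and memcpy() overhead showing up in the timings above.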
