Locale timings

From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Locale timings
Date: 2001-11-26 19:15:33
Message-ID: Pine.LNX.4.30.0111261852030.612-100000@peter.localdomain
Lists: pgsql-hackers

I did some "benchmarks" to check whether --enable-locale with LC_ALL=C is
just as fast as --disable-locale, to possibly justify making locale
support the default. This test only covers locale-aware comparisons,
which seems to be the critical aspect for all intents and purposes.

I loaded a table of a single text column with 454240 rows of English
words. The table had a size of 21.5 MB. The values were explicitly
de-sorted, but the order was the same across all test runs. Then I ran
SELECT * FROM test ORDER BY 1; and timed the wall-clock response time a
few times. All configuration parameters were left at the default.

The averaged results follow. Some logarithmic buffering cleverness
appeared to surface, but the results are still distinct enough to be
useful.

no locale: 58s
locale=C: 78s (ca. 33% slower)
locale=en_US: 118s (ca. 100% slower)

This confused me, because in my C library a strcoll() call with locale=C
is handed to strcmp() quite directly. A look into varlena.c:varstr_cmp()
shows that the locale-aware path does some extra copying because there is
no strncoll() function we can use with non-terminated strings.
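
To illustrate the copying, here is a minimal sketch of the two paths for
length-counted strings. It is not the actual varstr_cmp() source: the
function names counted_strcoll() and counted_strcmp() are hypothetical,
malloc()/free() stand in for palloc()/pfree(), and the use of memcmp()
in the non-locale path is an assumption made for the example.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Locale-aware comparison of two strings that are NOT null-terminated.
 * strcoll() needs terminated input, so each argument is copied into a
 * freshly allocated buffer first. (Hypothetical helper for illustration.)
 */
static int
counted_strcoll(const char *a, size_t alen, const char *b, size_t blen)
{
    char *a0 = malloc(alen + 1);
    char *b0 = malloc(blen + 1);
    int   result;

    if (a0 == NULL || b0 == NULL)
    {
        free(a0);
        free(b0);
        abort();                /* out of memory; a sketch, so just bail */
    }

    memcpy(a0, a, alen);        /* the extra copying ... */
    a0[alen] = '\0';
    memcpy(b0, b, blen);
    b0[blen] = '\0';

    result = strcoll(a0, b0);

    free(a0);                   /* ... and the extra allocator traffic */
    free(b0);
    return result;
}

/*
 * Non-locale comparison: works directly on the counted buffers, with no
 * allocation and no copying. (Also hypothetical.)
 */
static int
counted_strcmp(const char *a, size_t alen, const char *b, size_t blen)
{
    size_t n = (alen < blen) ? alen : blen;
    int    result = memcmp(a, b, n);

    /* On a common prefix, the shorter string sorts first. */
    if (result == 0 && alen != blen)
        result = (alen < blen) ? -1 : 1;
    return result;
}
```

Every comparison in the locale-aware path pays for two allocations and
two copies, and a sort of this table performs several million
comparisons.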

For testing's sake I replaced the two palloc() calls in that function with
alloca(), which is presumably the fastest possible memory allocator.
Result:

locale=C,alloca: 67s (ca. 15% slower)

This shows that we're wasting quite a bit of time allocating memory --
probably not only in this place. I'm pretty sure that the majority of the
rest of the gap comes from the memcpy() operations. Not that there's a
whole lot we can do about either of these things.

However, I feel that we could reasonably cope with this situation by
replacing

#ifdef USE_LOCALE
    /* locale-aware code */
#else
    /* non-locale code */
#endif

with

    if (locale_is_not_C)
    {
        /* locale-aware code */
    }
    else
    {
        /* non-locale code */
    }

This run-time check should have a minuscule performance impact, and it's
probably the plan for the multibyte side of things as well.
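
A minimal sketch of what such a run-time test could look like, assuming
null-terminated inputs for brevity. The names collate_is_c() and
compare_text() are hypothetical, and treating the locale names "C" and
"POSIX" as equivalent is an assumption of the example. Only standard
C library calls are used.

```c
#include <locale.h>
#include <stdbool.h>
#include <string.h>

/*
 * Returns true if the current collation locale is the plain C locale.
 * setlocale() with a NULL second argument only queries the setting.
 * (Hypothetical helper; real code would likely cache this in a flag.)
 */
static bool
collate_is_c(void)
{
    const char *name = setlocale(LC_COLLATE, NULL);

    return name != NULL &&
        (strcmp(name, "C") == 0 || strcmp(name, "POSIX") == 0);
}

/*
 * Chooses the comparison path at run time instead of at compile time.
 */
static int
compare_text(const char *a, const char *b)
{
    if (collate_is_c())
        return strcmp(a, b);    /* non-locale code */
    else
        return strcoll(a, b);   /* locale-aware code */
}
```

With the test evaluated once and kept in a flag, as in the
locale_is_not_C variable above, the cost per comparison is a single
branch.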

--
Peter Eisentraut peter_e(at)gmx(dot)net
