Re: An idea on faster CHAR field indexing

From: Giles Lean <giles(at)nemeton(dot)com(dot)au>
To: "Randall Parker" <randall(at)nls(dot)net>
Cc: "PostgreSQL-Dev" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: An idea on faster CHAR field indexing
Date: 2000-06-21 20:59:06
Message-ID: 11449.961621146@nemeton.com.au
Lists: pgsql-hackers


> So let me cut to the chase: I'm thinking that rather than store the
> actual character sequence of each field (or some subset of a field)
> in an index why not translate the characters into their collation
> sequence values and store _those_ in the index?

This is not an obvious win, since:

1. some collation rules require multiple passes over the data

2. POSIX strxfrm() will convert strings of characters to a form that
can be compared by strcmp() [i.e. single pass] but tends to greatly
increase memory requirements

I've only data for one implementation of strxfrm(), but the memory
usage startled me. In my application it was faster to use
strcoll() directly for collation than to pre-expand the data with
strxfrm().
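
For anyone who wants to check this trade-off on their own platform, here is
a minimal sketch in C. It is not taken from the application mentioned above;
the helper names make_sort_key() and orderings_agree() are made up for
illustration. It builds a strxfrm() key for a string, reports the key length
so it can be compared with strlen() of the input, and checks that strcmp()
on the keys orders two strings the same way strcoll() does on the originals.
How much larger the keys are depends entirely on the C library and the
locale in use.

```c
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a malloc'd strxfrm() key for s, or NULL if
 * allocation fails. *key_len receives the key length, excluding the
 * terminating NUL. The caller frees the result. */
static char *make_sort_key(const char *s, size_t *key_len)
{
    /* Size query: C99 allows a null destination when the size is 0. */
    size_t n = strxfrm(NULL, s, 0);
    char *key = malloc(n + 1);

    if (key == NULL)
        return NULL;
    strxfrm(key, s, n + 1);
    *key_len = n;
    return key;
}

/* Reduce a comparison result to -1, 0 or 1. */
static int sign(int v)
{
    return (v > 0) - (v < 0);
}

/* Hypothetical helper: return 1 if strcoll() on the raw strings and
 * strcmp() on their strxfrm() keys order a and b the same way, 0 if they
 * disagree, and -1 if a key could not be allocated. */
static int orderings_agree(const char *a, const char *b)
{
    size_t la = 0, lb = 0;
    char *ka = make_sort_key(a, &la);
    char *kb = make_sort_key(b, &lb);
    int result = -1;

    if (ka != NULL && kb != NULL)
        result = (sign(strcoll(a, b)) == sign(strcmp(ka, kb)));
    free(ka);
    free(kb);
    return result;
}

/* Print the input length and the key length for one string, so the
 * expansion can be seen for the current LC_COLLATE setting. */
static void report_expansion(const char *s)
{
    size_t key_len = 0;
    char *key = make_sort_key(s, &key_len);

    if (key != NULL)
        printf("input %lu bytes, key %lu bytes\n",
               (unsigned long) strlen(s), (unsigned long) key_len);
    free(key);
}
```

To try it, call setlocale(LC_COLLATE, "") at program start so the
environment's collation rules are used, then run report_expansion() over a
sample of real column values. In the default "C" locale the keys are
normally no bigger than the input, so any expansion only shows up with a
non-trivial locale.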

Regards,

Giles
