From: | Giles Lean <giles(at)nemeton(dot)com(dot)au> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | "Randall Parker" <randall(at)nls(dot)net>, "PostgreSQL-Dev" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: An idea on faster CHAR field indexing |
Date: | 2000-06-22 08:47:43 |
Message-ID: | 12346.961663663@nemeton.com.au |
Lists: | pgsql-hackers |
> Interesting. That certainly suggests strxfrm could be a loser for
> a database index too, but I agree it'd be nice to see some actual
> measurements rather than speculation.
>
> What locale(s) were you using when testing your sort code? I suspect
> the answers might depend on locale quite a bit...
I did a little more measurement today. It's still only anecdotal
evidence -- I wasn't terribly rigorous -- but here are my results.
My data file consisted of ~660,000 lines and a total size of ~200MB.
Each line had part descriptions in German and some uninteresting
fields. I stripped out the uninteresting fields and read the file
calling strxfrm() for each line. I recorded the total input
bytes and the total bytes returned by strxfrm().
HP-UX 11.00 de_DE.roman8 locale:
input bytes: 179647811
result bytes: 1447833496 (increase factor 8.05)
Solaris 2.6 de_CH locale:
input bytes: 179647811
result bytes: 1085875122 (increase factor 6.04)
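
A rough sketch of this kind of measurement follows. It is not the original test program: the helper names (xfrm_len, measure, increase_factor) and the 8192-byte line buffer are assumptions made for illustration. It also asks strxfrm() only for the required length (NULL buffer, size 0) instead of storing the transformed key, which is enough to total the result bytes.

```c
#include <stdio.h>
#include <string.h>

/* Bytes strxfrm() would need for s, excluding the terminating NUL,
 * under the current LC_COLLATE locale. With a size of 0 the function
 * writes nothing and only reports the required length. */
size_t xfrm_len(const char *s)
{
    return strxfrm(NULL, s, 0);
}

/* Hypothetical helper: read lines from `in` and total the input bytes
 * and the transformed bytes. Newlines are not counted. A line longer
 * than the buffer is measured in pieces. */
void measure(FILE *in, unsigned long long *in_bytes,
             unsigned long long *out_bytes)
{
    char line[8192];

    *in_bytes = 0;
    *out_bytes = 0;
    while (fgets(line, sizeof line, in) != NULL) {
        line[strcspn(line, "\n")] = '\0';   /* drop trailing newline */
        *in_bytes += strlen(line);
        *out_bytes += xfrm_len(line);
    }
}

/* Ratio of transformed bytes to input bytes; 0.0 for empty input. */
double increase_factor(unsigned long long in_bytes,
                       unsigned long long out_bytes)
{
    return in_bytes ? (double)out_bytes / (double)in_bytes : 0.0;
}
```

A caller would select the locale first, for example with setlocale(LC_COLLATE, ""), and then call measure(stdin, &in, &out). Without a setlocale() call the program runs in the "C" locale, where little or no expansion should be expected.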
I didn't time the test program on Solaris, but on HP-UX this program
took longer to run than a simplistic qsort() using strcoll() does, and
my comparison sort program has to write the data out as well, which
the strxfrm() calling program didn't do.
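
For comparison, a simplistic qsort() over line pointers with a strcoll() comparator might look like the sketch below. The names cmp_strcoll and sort_lines are illustrative assumptions, not taken from the sort program mentioned above, and reading and writing the data is left out.

```c
#include <stdlib.h>
#include <string.h>

/* qsort() comparator for an array of char * elements: orders strings
 * by the collation rules of the current LC_COLLATE locale. */
int cmp_strcoll(const void *a, const void *b)
{
    const char *const *sa = a;
    const char *const *sb = b;

    return strcoll(*sa, *sb);
}

/* Sort n line pointers in place; the strings themselves are not moved. */
void sort_lines(char **lines, size_t n)
{
    qsort(lines, n, sizeof *lines, cmp_strcoll);
}
```

The trade-off is that strcoll() repeats the collation work on every comparison, while strxfrm() transforms each key once but leaves keys several times larger than the input to store and compare.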
Regards,
Giles