Re: improve Chinese locale performance

From: Greg Stark <stark(at)mit(dot)edu>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, Quan Zongliang <quanzongliang(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: improve Chinese locale performance
Date: 2013-07-22 16:49:15
Message-ID: CAM-w4HMTTJDZPgn1RWm_aqs+mV7f86FvdQsCsayJ2QuC6i98nA@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jul 22, 2013 at 12:50 PM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
> I think part of the problem is that we call strcoll for each comparison,
> instead of doing strxfrm once for each datum and then just strcmp for
> each comparison. That is effectively equivalent to what the proposal
> implements.

FWIW I used to be a big proponent of using strxfrm. But upon further
analysis I realized it's a really difficult tradeoff. strxfrm saves
potentially a lot of CPU cost, but at the expense of expanding the size
of the sort key. If the sort spills to disk, or even if it's just
memory-bandwidth limited, it might actually be slower than doing the
additional CPU work of calling strcoll.
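
To make the tradeoff concrete, here is a rough sketch in Python, whose
locale module wraps the C library's strcoll and strxfrm. It is only an
illustration, not what PostgreSQL does: the helper names
(sort_with_strcoll, sort_with_strxfrm, key_expansion) are made up for
this example, and Python measures key length in characters rather than
bytes, so the expansion ratio is only indicative.

```python
import locale
from functools import cmp_to_key


def sort_with_strcoll(words):
    """Sort by calling strcoll for every comparison the sort performs."""
    return sorted(words, key=cmp_to_key(locale.strcoll))


def sort_with_strxfrm(words):
    """Sort by calling strxfrm once per datum, then comparing the keys."""
    return sorted(words, key=locale.strxfrm)


def key_expansion(words):
    """Ratio of total transformed-key length to total input length.

    Lengths are counted in characters (Python strings), not bytes, so
    treat the result as a rough indicator of sort-key growth only.
    """
    original = sum(len(w) for w in words)
    if original == 0:
        return 0.0
    transformed = sum(len(locale.strxfrm(w)) for w in words)
    return transformed / original


if __name__ == "__main__":
    try:
        # Use the collation rules from the environment (LANG / LC_COLLATE).
        locale.setlocale(locale.LC_COLLATE, "")
    except locale.Error:
        # Fall back to the plain "C" locale if the environment's is missing.
        locale.setlocale(locale.LC_COLLATE, "C")

    sample = ["pear", "apple", "fig", "banana"]
    print(sort_with_strcoll(sample))
    print(sort_with_strxfrm(sample))
    print("key expansion ratio:", key_expansion(sample))
```

Both sorts should produce the same order in any given locale; what
differs is where the cost lands. The strxfrm variant pays once per datum
and carries the larger keys through the sort, while the strcoll variant
keeps the original strings and pays on every comparison.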

It's hard to see how to decide in advance which way will be faster. I
suspect strxfrm is still the better bet, especially for complex locales
based on large character sets, like Chinese. strcoll might still win by
a large margin on simple, mostly-ASCII character sets.

--
greg
