Re: Timsort performance, quicksort

From: Peter Geoghegan <peter(at)2ndquadrant(dot)com>
To: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Cc: PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg S <stark(at)mit(dot)edu>
Subject: Re: Timsort performance, quicksort
Date: 2012-04-20 02:12:36
Message-ID: CAEYLb_VMvuibpW793rfMK+3eB=3B+aWrnfD1ZxKHg=MZiqL5JQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 19 April 2012 19:24, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr> wrote:
> I kind of understood timsort would shine in sorting text in non-C
> collation, because of the comparison cost. So a test in some UTF8
> collation or other would be interesting, right?

That's certainly the theory, yes. In practice, even though timsort
lives up to its promise of significantly reducing the number of
comparisons required in many common situations, my implementation does
not actually perform better than qsort_arg. Even a reduction of over
10% in the number of comparisons does not result in a net performance
gain. It wouldn't surprise me if the implementation used is quite
suboptimal, and it might well be worth profiling and optimising. It
doesn't appear to be the big win that I'd hoped for though. It's
necessary to stretch the assumed cost of a comparison rather a lot
further than the very common case of sorting a single key of non-c
collated text for it to be worth it, and that's just too thin for me
to sink more time into this right now.

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2012-04-20 02:54:50 Re: Plan stability versus near-exact ties in cost estimates
Previous Message Jim Nasby 2012-04-19 23:53:53 Re: Plan stability versus near-exact ties in cost estimates