Re: Double sorting split patch

From: Peter Geoghegan <peter(at)2ndquadrant(dot)com>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Double sorting split patch
Date: 2011-09-18 13:11:41
Message-ID: CAEYLb_XMnANbHKhxCdTKAp40vTic99v30sbE=UFBenCZayjuiA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 18 September 2011 01:51, Greg Stark <stark(at)mit(dot)edu> wrote:
> I think we provided the qsort implementation for the benefit of
> platforms that didn't have a decent one and then ended up switching to
> use it always for some reason I don't recall.  It wasn't because we
> thought we were better at writing qsort implementations than OS
> vendors.

The comment:

* We have modified their original by adding a check for already-sorted input,
* which seems to be a win per discussions on pgsql-hackers around 2006-03-21.

sort of suggests that it *was* at least in part because we thought we
could do a better job, but that isn't a significant detail.

>> I wondered in passing if compiler vendors had got around to figuring
>> out a way of solving the "inlining function pointer" problem that I
>> first read about years ago, so I ran this code, which benchmarks the
>> macro-based qsort above:
>
> You might need -fipa-pta or some other option. Or maybe LLVM would
> have a better chance of pulling this off. Compilers are usually pretty
> loath to generate specializations for every call site for fear of
> bloating the code.

Compilers do a fairly good job of generating the specialisations
appropriately, which leads to the observation that the inline keyword
is kind of at the wrong level of granularity, as individual call sites
are inlined, not functions.

I built this program with a recent build of clang from SVN (left over
from my work on Clang a few weeks back, as it happens) on a more
powerful machine, and the difference narrowed:

C quick-sort time elapsed: 1.89
Inline C quick-sort time elapsed: 1.54

This still seems significant though.

> In any case I don't see how you're going to inline random database
> functions. Those are the cases where large amounts of data are being
> sorted. It's possible we sort small sets of data for internal reasons
> very frequently but I don't recall it turning up at the top of the
> profiles posted in recent days.

Well, this point was raised in relation to the OP's patch, which does
have a couple of simple comparators that look like good candidates for
inlining. There are other really simple comparators around like that -
one that I'm aware of off-hand is the one used to sort
pg_stat_statements entries to find victims. It doesn't have to have
major benefits to be worth undertaking - it only has to be worth the
effort of its original implementation and ongoing maintenance.

> Now a JIT that actually inlined random database functions into qsort
> and optimized the result would be pretty cool. But it doesn't seem
> like it's going to happen tomorrow...

Yeah, that would be extremely cool. I'd guess that the main reason it
isn't going to happen tomorrow is not so much technical, but because
there would be about a dozen controversies involved in integrating an
available, suitable JIT - how does it integrate with the build system,
licensing, choice of implementation language, support on less popular
platforms, how do package managers handle the dependency, can we trust
the developers/steering committee of the JIT (Okay, so I'm really
thinking about LLVM here), and so on. That's just off the top of my
head.

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2011-09-18 14:08:05 Re: Is there really no interest in SQL Standard?
Previous Message Kerem Kat 2011-09-18 09:39:30 Adding CORRESPONDING to Set Operations