Re: Progress on fast path sorting, btree index creation time

From: Peter Geoghegan <peter(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jay Levitt <jay(dot)levitt(at)gmail(dot)com>, "Jim Decibel! Nasby" <decibel(at)decibel(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Progress on fast path sorting, btree index creation time
Date: 2012-02-09 00:54:36
Message-ID: CAEYLb_UK-BqSbKhoRUr_bfx9SRq4tH0rW97r2X7oAgAOsc+6eg@mail.gmail.com
Lists: pgsql-hackers

On 8 February 2012 23:33, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Wed, Feb 8, 2012 at 1:48 PM, Peter Geoghegan <peter(at)2ndquadrant(dot)com> wrote:
>> That was clear from an early stage, and is something that I
>> acknowledged way back in September
>
> OK, so why didn't/don't we do and commit that part first, and then
> proceed to argue about the remainder once it's in?

I have no objection to that. I'm not sure that it's going to be
possible to agree to anything beyond what has already been agreed. We
don't seem to be covering new ground.

What would it take to convince you to be more inclusive, and have at
least a few full specialisations, in particular, int4 and int8 single
key full specialisations? I'm sure that you'll agree that they're the
second most compelling case. You don't seem able to even /describe/ a
standard that I could meet to have those additional specialisations
committed, which is a shame, since they appear to be
pretty decent wins above and beyond the generic single key
specialisation, all things considered.

I suppose that the standard refrain here is "it's your patch; you must
prove the case for committing it". That would be fine by me, but this
isn't just about me, and it seems to be a practical impossibility in
this case. It may be quite impossible even with far greater resources
than I can afford to apply here, since it seems like you're hinting at
some kind of rigorous proof that this cannot cause a regression for a
single client, even though in order to see that regression multiple
other clients would have to be benefiting. I cannot provide such a
proof, since it would probably have to consider all commercially
available CPUs and all possible workloads - I doubt that anyone can.
That being the case, I'm eager not to waste any more time on this. I
bear no ill will; that just seems to be the situation that we find
ourselves in.

That said, if you can describe the standard, I'll try and meet it -
you seem to be suggesting that months and months of benchmarks
wouldn't really change things, since this is the first step on the
road to binary bloat ruin. The only fundamentally new argument that I
can offer is that the standard the community applies here severely
limits the number of performance improvements that can be added, and
the aggregate effect is that we're quite certainly worse off. It's a
pity this isn't a really easy decision for you to
make, but that's the nature of these things, and we should expect that
to increasingly be the case. Is it really so unacceptable for us to
risk doing a little worse under some rare circumstances in order to do
better under some common circumstances? Why should it be an easy
decision - when are important decisions ever easy?

>> I think that there may be additional benefits from making the
>> qsort_arg specialisation look less like a c stdlib one, like refining
>> the swap logic to have compile-time knowledge of the type it is
>> sorting. I'm thinking that we could usefully trim quite a bit from
>> this:
>
> That's an interesting idea, which seems worth pursuing, though
> possibly not for 9.2.

Well, it's really just a code clean-up. I suspect that qsort_arg is a
minimally modified version of the NetBSD one that wasn't necessarily
thoroughly understood when it was initially added (not that I'm
suggesting that it ought to have been). Then again, you might prefer
to keep it as consistent as possible with qsort (the other copy of
that sort function that we already have).
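
As an illustration only (this is not code from the patch or from
qsort_arg), the difference between a stdlib-style swap and one with
compile-time knowledge of the element type might be sketched as below.
The names swap_generic, swap_int32 and insertion_sort_int32 are
hypothetical, and int32_t merely stands in for an int4 key.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Generic swap in the style of a stdlib qsort: the element width is only
 * known at run time, so the bytes are exchanged one at a time.
 */
static void
swap_generic(void *a, void *b, size_t size)
{
    unsigned char *pa = a;
    unsigned char *pb = b;

    while (size-- > 0)
    {
        unsigned char tmp = *pa;

        *pa++ = *pb;
        *pb++ = tmp;
    }
}

/*
 * Hypothetical specialised swap: the type is fixed at compile time, so the
 * compiler can emit a plain load/store pair with no size argument or loop.
 */
static inline void
swap_int32(int32_t *a, int32_t *b)
{
    int32_t tmp = *a;

    *a = *b;
    *b = tmp;
}

/*
 * Small insertion sort using the specialised swap and an inlined
 * comparison, standing in for the inner part of a specialised sort.
 */
static void
insertion_sort_int32(int32_t *v, size_t n)
{
    for (size_t i = 1; i < n; i++)
        for (size_t j = i; j > 0 && v[j - 1] > v[j]; j--)
            swap_int32(&v[j - 1], &v[j]);
}
```

Whether the specialised form is measurably faster depends on the
compiler and platform; the sketch only shows where the run-time size
handling disappears.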

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services
