Re: qsort again (was Re: [PERFORM] Strange Create Index

From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Ron <rjpeace(at)earthlink(dot)net>
Cc: "Steinar H(dot) Gunderson" <sgunderson(at)bigfoot(dot)com>, pgsql-performance(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: qsort again (was Re: [PERFORM] Strange Create Index
Date: 2006-02-16 14:48:33
Message-ID: 20060216144833.GG26127@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance

On Thu, Feb 16, 2006 at 08:22:55AM -0500, Ron wrote:
> 3= Especially in modern systems where the gap between internal CPU
> bandwidth and memory bandwidth is so great, the overhead of memory
> accesses for comparisons and moves is the majority of the overhead
> for both the pivot choosing and the partitioning algorithms within
> quicksort. Particularly random memory accesses. The reason (#GPRs -
> 1) is a magic constant is that it's the most you can compare and move
> using only register-to-register operations.

But how much of this applies to us? We're not sorting arrays of
integers, we're sorting pointers to tuples. So while moves cost very
little, a comparison costs hundreds, maybe thousands of cycles. A tuple
can easily be two or three cachelines and you're probably going to
access all of it, not to mention the Fmgr structures and the Datums
themselves.

None of this is cache friendly. The actual tuples themselves could be
spread all over memory (I don't think any particular effort is expended
trying to minimize fragmentation).

Do these algorithms discuss the case where a comparison is more than
1000 times the cost of a move?

Where this does become interesting is where we can convert a datum to
an integer such that if f(A) > f(B) then A > B. Then we can sort on
f(X) first with just integer comparisons and then do a full tuple
comparison only if f(A) = f(B). This would be much more cache-coherent
and make these algorithms much more applicable in my mind.

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2006-02-16 14:48:54 Re: [HACKERS] Patch Submission Guidelines
Previous Message Tom Lane 2006-02-16 14:42:40 Re: [HACKERS] qsort again (was Re: Strange Create Index behaviour)

Browse pgsql-performance by date

  From Date Subject
Next Message Gary Doades 2006-02-16 15:42:36 Re: [HACKERS] qsort again (was Re: Strange Create Index
Previous Message Tom Lane 2006-02-16 14:42:40 Re: [HACKERS] qsort again (was Re: Strange Create Index behaviour)