Skip site navigation (1) Skip section navigation (2)

Re: Index creation time and distribution

From: Matthew Wakeling <matthew(at)flymine(dot)org>
To: pgsql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Index creation time and distribution
Date: 2008-05-22 14:10:06
Message-ID: (view raw, whole thread or download thread mbox)
Lists: pgsql-performance
On Thu, 22 May 2008, Tom Lane wrote:
> Do you have maintenance_work_mem set large enough that the index
> creation sort is done in-memory?  8.1 depends on the platform's qsort
> and a lot of them are kinda pessimal for input like this.

Looking at the fact that other indexes on the same table are created 
quickly, it seems that the maintenance_work_mem isn't the issue - the sort 
algorithm is.

Having lots of elements the same value is a worst-case-scenario for a 
naive quicksort. I am in the middle of testing sorting algorithms for a 
performance lecture I'm going to give, and one of the best algorithms I 
have seen yet is that used in Java's java.util.Arrays.sort(). I haven't 
been able to beat it with any other comparison sort yet (although I have 
beaten it with a bucket sort, but I wouldn't recommend such an algorithm 
for a database).

From the JavaDoc:

> The sorting algorithm is a tuned quicksort, adapted from Jon L. Bentley 
> and M. Douglas McIlroy's "Engineering a Sort Function", 
> Software-Practice and Experience, Vol. 23(11) P. 1249-1265 (November 
> 1993). This algorithm offers n*log(n) performance on many data sets that 
> cause other quicksorts to degrade to quadratic performance.


First law of computing:  Anything can go wro
sig: Segmentation fault.  core dumped.

In response to

pgsql-performance by date

Next:From: Scott MarloweDate: 2008-05-22 16:50:33
Subject: Re: Index creation time and distribution
Previous:From: Guillaume SmetDate: 2008-05-22 13:38:25
Subject: Re: Index creation time and distribution

Privacy Policy | About PostgreSQL
Copyright © 1996-2017 The PostgreSQL Global Development Group