Randomisation for ensuring nlogn complexity in quicksort

From: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
To: PgHacker <pgsql-hackers(at)postgresql(dot)org>
Subject: Randomisation for ensuring nlogn complexity in quicksort
Date: 2013-06-30 12:30:20
Message-ID: 896779CC-C2BD-420D-8BB8-F45B0DAAF2BF@gmail.com
Lists: pgsql-hackers

Hi all,

I have been reading the recent discussion and doing a bit of research, and I think we should really go with the idea of randomising the input data (if it is not already completely presorted), so that we avoid quadratic complexity.
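
By randomising I mean something like an up-front Fisher-Yates shuffle. A rough sketch on a plain int array, just for illustration (not the actual tuplesort code, which works on SortTuples):

#include <stdlib.h>

/*
 * Illustrative only: shuffle the array before sorting so that a
 * pathologically ordered input no longer maps to a pathological
 * sequence of pivots.
 */
static void
shuffle(int *a, int n)
{
	int		i;

	for (i = n - 1; i > 0; i--)
	{
		int		j = rand() % (i + 1);
		int		tmp = a[i];

		a[i] = a[j];
		a[j] = tmp;
	}
}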

One easy way to do that could be to take a sample of the data set and pick a pivot out of it. An even better way could be to take multiple samples spread across the data set, select a value from each of them, and then derive a cumulative pivot from those (the median, maybe).
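
Roughly something like this (again only a sketch on an int array; the sample positions and the choice of three samples are made up for illustration):

#include <stdlib.h>

/*
 * Illustrative only: choose a pivot for a[lo..hi] by picking one value
 * from each third of the range, at a random offset within that third,
 * and returning the median of the three samples as the pivot.
 */
static int
choose_pivot(const int *a, int lo, int hi)
{
	int		span = hi - lo;
	int		s1, s2, s3, tmp;

	/* one sample from each third of the range, with some randomness */
	s1 = a[lo + rand() % (span / 3 + 1)];
	s2 = a[lo + span / 3 + rand() % (span / 3 + 1)];
	s3 = a[lo + 2 * (span / 3) + rand() % (span - 2 * (span / 3) + 1)];

	/* the median of the samples becomes the cumulative pivot */
	if (s1 > s2) { tmp = s1; s1 = s2; s2 = tmp; }
	if (s2 > s3) { tmp = s2; s2 = s3; s3 = tmp; }
	if (s1 > s2) { tmp = s1; s1 = s2; s2 = tmp; }
	return s2;
}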

In any case, I really think that if we do not go with the above ideas, then we should somehow factor in the degree of randomness of the input data when deciding between quicksort and external merge sort for a set of rows.
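
For example, a cheap estimate of how "random" the input is could come from sampling a few adjacent pairs and counting how many are out of order (near 0 means nearly presorted, near 0.5 means essentially random). Purely illustrative; the real decision would have to live in the tuplesort code:

#include <stdlib.h>

/*
 * Illustrative only: sample nsamples adjacent pairs and return the
 * fraction that are out of order, as a rough disorder estimate.
 */
static double
estimate_disorder(const int *a, int n, int nsamples)
{
	int		i;
	int		out_of_order = 0;

	if (n < 2 || nsamples <= 0)
		return 0.0;

	for (i = 0; i < nsamples; i++)
	{
		int		pos = rand() % (n - 1);

		if (a[pos] > a[pos + 1])
			out_of_order++;
	}
	return (double) out_of_order / nsamples;
}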

This shouldn't be too complex, and should give us expected O(n log n) complexity even for adversarial data sets, without affecting the normal data sets that show up in everyday transactions. I even believe that those data sets will also benefit from the above optimisation.

Thoughts/Comments?

Regards,
Atri

Sent from my iPad
