Re: Memory usage during sorting

From:	Jim Nasby <jim(at)nasby(dot)net>
To:	Greg Stark <stark(at)mit(dot)edu>
Cc:	Peter Geoghegan <peter(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Memory usage during sorting
Date:	2012-05-01 16:57:39
Message-ID:	4FA01603.80607@nasby.net
Lists:	pgsql-hackers

On 4/17/12 7:19 AM, Greg Stark wrote:
> On Mon, Apr 16, 2012 at 10:42 PM, Peter Geoghegan<peter(at)2ndquadrant(dot)com> wrote:
>> All but 4 regression tests pass, but they don't really count
>> as failures, since they're down to an assumption in the tests that the
>> order certain tuples appear should be the same as our current
>> quicksort implementation returns them, even though, in these
>> problematic cases, that is partially dictated by implementation - our
>> quicksort isn't stable, but timsort is.
> This is an interesting point. If we use a stable sort we'll probably
> be stuck with stable sorts indefinitely. People will start depending
> on the stability and then we'll break their apps if we find a faster
> sort that isn't stable.

I have often wished that I could inject entropy into a test database to ferret out these kinds of issues. In particular I worry about things like users depending on specific values for serial types or depending on the order of data in the heap.

I would find it useful if Postgres had an option to intentionally inject more randomness in such areas, at the cost of some performance. For example: have nextval() burn through a small, random number of values before returning one, and have scan operators do some re-ordering of tuples where appropriate.
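
To make the idea concrete, here is a rough sketch in Python rather than anything that exists in Postgres. The names JitterySequence, max_skip and jittery_scan are hypothetical, invented only for illustration. It models a sequence that skips a small random number of values on each call, and a scan that hands rows back in arbitrary order.

```python
import random


class JitterySequence:
    """Toy model of a sequence whose nextval() burns a few values first.

    Hypothetical illustration only; this is not a Postgres API.
    """

    def __init__(self, max_skip=3, seed=None):
        self._last = 0
        self._max_skip = max_skip
        self._rng = random.Random(seed)

    def nextval(self):
        # Skip between 0 and max_skip values, then return the next one.
        # Values stay unique and increasing, but gaps are unpredictable.
        self._last += self._rng.randint(0, self._max_skip) + 1
        return self._last


def jittery_scan(rows, seed=None):
    """Return the same rows in a shuffled order, as a scan with no
    ORDER BY is allowed to do."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    return shuffled


seq = JitterySequence(max_skip=3, seed=1)
ids = [seq.nextval() for _ in range(5)]
rows = [(row_id, "row-%d" % n) for n, row_id in enumerate(ids)]
scanned = jittery_scan(rows, seed=1)
```

A test that hard-codes ids such as 1, 2, 3, or that compares scanned to rows without sorting first, would be expected to fail intermittently under this kind of jitter, which is the point of the exercise.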

If we had such an option and encouraged users to use it in testing, it would reduce the risk of people depending on behavior that they shouldn't depend on.
--
Jim C. Nasby, Database Architect jim(at)nasby(dot)net
512.569.9461 (cell) http://jim.nasby.net

In response to

Re: Memory usage during sorting at 2012-04-17 12:19:21 from Greg Stark