
Re: [PERFORM] A Better External Sort?

From: "Dann Corbit" <DCorbit(at)connx(dot)com>
To: "PFC" <lists(at)boutiquenumerique(dot)com>, <rjpeace(at)earthlink(dot)net>
Cc: "Pg Hackers" <pgsql-hackers(at)postgresql(dot)org>,<pgsql-performance(at)postgresql(dot)org>
Subject: Re: [PERFORM] A Better External Sort?
Date: 2005-09-30 17:44:34
Message-ID: D425483C2C5C9F49B5B7A41F8944154757D133@postal.corporate.connx.com
Lists: pgsql-hackers, pgsql-performance
> -----Original Message-----
> From: pgsql-hackers-owner(at)postgresql(dot)org [mailto:pgsql-hackers-
> owner(at)postgresql(dot)org] On Behalf Of PFC
> Sent: Thursday, September 29, 2005 9:10 AM
> To: rjpeace(at)earthlink(dot)net
> Cc: Pg Hackers; pgsql-performance(at)postgresql(dot)org
> Subject: Re: [HACKERS] [PERFORM] A Better External Sort?
> 
> 
> 	Just to add a little anarchy in your nice debate...
> 
> 	Who really needs all the results of a sort on your terabyte
> table?

Reports with ORDER BY/GROUP BY, and many other possibilities.  The
oft-quoted figure is that 40% of mainframe CPU cycles are spent sorting,
because huge volumes of data take a great deal of work to categorize
meaningfully.  Suppose that instead of a terabyte of data (or a
petabyte, or whatever) we had 10% of it.  That is still a lot of data.

> 	I guess not many people do a SELECT from such a table and want
> all the results.

What happens when they do?  The cases where it is already fast are not
very important.  The cases where things go into the crapper are the ones
that need attention.

> 	So, this leaves:
> 	- Really wanting all the results, to fetch using a cursor,
> 	- CLUSTER type things, where you really want everything in
> order,
> 	- Aggregates (Sort->GroupAggregate), which might really need to
> sort the whole table,
> 	- Complex queries where the whole dataset needs to be examined,
> in order to return a few values,
> 	- Joins (again, the whole table is probably not going to be
> selected),
> 	- And the ones I forgot.
> 
> 	However,
> 
> 	Most likely you only want to SELECT N rows, in some ordering :
> 	- the first N (ORDER BY x LIMIT N)
> 	- last N (ORDER BY x DESC LIMIT N)

For these, the QuickSelect algorithm is what is wanted: it finds the
Nth-smallest key and partitions the array around it in expected linear
time, with no need to sort the rest.  For example:
#include <stdlib.h>
typedef double  Etype;

extern Etype    RandomSelect(Etype * A, size_t p, size_t r, size_t i);
extern size_t   RandRange(size_t a, size_t b);
extern size_t   RandomPartition(Etype * A, size_t p, size_t r);
extern size_t   Partition(Etype * A, size_t p, size_t r);

/*
**
** In the following code, every reference to CLR means:
**
**    "Introduction to Algorithms"
**    By Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest
**    ISBN 0-07-013143-0
*/


/*
** CLR, page 187
*/
Etype           RandomSelect(Etype A[], size_t p, size_t r, size_t i)
{
    size_t          q,
                    k;
    if (p == r)
        return A[p];
    q = RandomPartition(A, p, r);
    k = q - p + 1;

    if (i <= k)
        return RandomSelect(A, p, q, i);
    else
        return RandomSelect(A, q + 1, r, i - k);
}

/*
** Return a pseudo-random index in the inclusive range [a, b];
** scaling by (b - a + 1) lets b itself be chosen as a pivot position.
*/
size_t          RandRange(size_t a, size_t b)
{
    size_t          c = (size_t) ((double) rand() /
                                  ((double) RAND_MAX + 1.0) * (b - a + 1));
    return c + a;
}

/*
** CLR, page 162
*/
size_t          RandomPartition(Etype A[], size_t p, size_t r)
{
    size_t          i = RandRange(p, r);
    Etype           Temp;
    Temp = A[p];
    A[p] = A[i];
    A[i] = Temp;
    return Partition(A, p, r);
}

/*
** CLR, page 154
*/
size_t          Partition(Etype A[], size_t p, size_t r)
{
    Etype           x,
                    temp;
    size_t          i,
                    j;

    x = A[p];
    i = p - 1;
    j = r + 1;

    for (;;) {
        do {
            j--;
        } while (!(A[j] <= x));
        do {
            i++;
        } while (!(A[i] >= x));
        if (i < j) {
            temp = A[i];
            A[i] = A[j];
            A[j] = temp;
        } else
            return j;
    }
}

> 	- WHERE x>value ORDER BY x LIMIT N
> 	- WHERE x<value ORDER BY x DESC LIMIT N
> 	- and other variants
> 
> 	Or, you are doing a Merge JOIN against some other table ; in
> that case, yes, you might need the whole sorted terabyte table, but
> most likely there are WHERE clauses in the query that restrict the
> set, and thus, maybe we can get some conditions or limit values on
> the column to sort.

WHERE-clause filters are, in the SQL standard's logical evaluation
order, applied AFTER the join operations (though an optimizer may push
a filter down when doing so cannot change the result).

> 	Also the new, optimized hash join, which is more memory
> efficient, might cover this case.

Only for equijoins.  Not every ORDER BY is applied to a join, and not
every join is an equality join.

> 	Point is, sometimes, you only need part of the results of your
> sort.  And the bigger the sort, the more likely it becomes that you
> only want part of the results.

That is an assumption that will sometimes be true, and sometimes not.
It is not possible to predict usage patterns for a general purpose
database system.

> 	So, while we're in the fun hand-waving, new-algorithm-trying
> mode, why not consider this right from the start ?  (I know I'm
> totally in hand-waving mode right now, so slap me if needed.)
> 
> 	I'd say your new, fancy sort algorithm needs a few more input
> values :
> 
> 	- Range of values that must appear in the final result of the
> sort : none, minimum, maximum, both, or even a set of values from the
> other side of the join, hashed, or sorted.

That will already happen (or it certainly ought to).

> 	- LIMIT information (first N, last N, none)

That will already happen (or it certainly ought to -- I would be pretty
surprised if it does not happen)

> 	- Enhanced LIMIT information (first/last N values of the second
> column to sort, for each value of the first column) (the infamous
> "top 10 by category" query)
> 	- etc.

All the filters will (at some point) be applied to the data, unless a
formal rule prevents a given filter from being applied at all.
 
> 	With this, the amount of data that needs to be kept in memory is
> dramatically reduced, from the whole table (even using your compressed
> keys, that's big) to something more manageable which will be closer to
> the size of the final result set which will be returned to the client,
> and avoid a lot of effort.

Sorting the minimal set is a good idea, and sometimes the savings are
large.  Still, I would be pretty surprised if a large fraction of data
that does not have to be included is actually processed during the
sorts.
 
> 	So, this would not be useful in all cases, but when it applies,
> it would be really useful.

No argument there.  And if an algorithm is being reworked, it is a good
idea to check that every filter the language standard allows to be
applied before the sort actually is applied before the sort.
