Skip site navigation (1) Skip section navigation (2)

Re: [PERFORM] A Better External Sort?

From: "Dann Corbit" <DCorbit(at)connx(dot)com>
To: "PFC" <lists(at)boutiquenumerique(dot)com>, <rjpeace(at)earthlink(dot)net>
Cc: "Pg Hackers" <pgsql-hackers(at)postgresql(dot)org>,<pgsql-performance(at)postgresql(dot)org>
Subject: Re: [PERFORM] A Better External Sort?
Date: 2005-09-30 17:44:34
Message-ID: (view raw, whole thread or download thread mbox)
Lists: pgsql-hackerspgsql-performance
> -----Original Message-----
> From: pgsql-hackers-owner(at)postgresql(dot)org [mailto:pgsql-hackers-
> owner(at)postgresql(dot)org] On Behalf Of PFC
> Sent: Thursday, September 29, 2005 9:10 AM
> To: rjpeace(at)earthlink(dot)net
> Cc: Pg Hackers; pgsql-performance(at)postgresql(dot)org
> Subject: Re: [HACKERS] [PERFORM] A Better External Sort?
> 	Just to add a little anarchy in your nice debate...
> 	Who really needs all the results of a sort on your terabyte
table ?

Reports with ORDER BY/GROUP BY, and many other possibilities.  40% of
mainframe CPU cycles are spent sorting.  That is because huge volumes of
data require lots of energy to be meaningfully categorized.  Let's
suppose that instead of a terabyte of data (or a petabyte or whatever)
we have 10% of it.  That's still a lot of data.

> 	I guess not many people do a SELECT from such a table and want
> the
> results. 

What happens when they do?  The cases where it is already fast are not
very important.  The cases where things go into the crapper are the ones
that need attention.

>So, this leaves :
> 	- Really wanting all the results, to fetch using a cursor,
> 	- CLUSTER type things, where you really want everything in
> 	- Aggregates (Sort->GroupAggregate), which might really need to
> the
> whole table.
> 	- Complex queries where the whole dataset needs to be examined,
> order
> to return a few values
> 	- Joins (again, the whole table is probably not going to be
> selected)
> 	- And the ones I forgot.
> 	However,
> 	Most likely you only want to SELECT N rows, in some ordering :
> 	- the first N (ORDER BY x LIMIT N)
> 	- last N (ORDER BY x DESC LIMIT N)

For these, the QuickSelect algorithm is what is wanted.  For example:
#include <stdlib.h>
typedef double  Etype;

extern Etype    RandomSelect(Etype * A, size_t p, size_t r, size_t i);
extern size_t   RandRange(size_t a, size_t b);
extern size_t   RandomPartition(Etype * A, size_t p, size_t r);
extern size_t   Partition(Etype * A, size_t p, size_t r);

** In the following code, every reference to CLR means:
**    "Introduction to Algorithms"
**    By Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest
**    ISBN 0-07-013143-0

** CLR, page 187
Etype           RandomSelect(Etype A[], size_t p, size_t r, size_t i)
    size_t          q,
    if (p == r)
        return A[p];
    q = RandomPartition(A, p, r);
    k = q - p + 1;

    if (i <= k)
        return RandomSelect(A, p, q, i);
        return RandomSelect(A, q + 1, r, i - k);

size_t          RandRange(size_t a, size_t b)
    size_t          c = (size_t) ((double) rand() / ((double) RAND_MAX +
1) * (b - a));
    return c + a;

** CLR, page 162
size_t          RandomPartition(Etype A[], size_t p, size_t r)
    size_t          i = RandRange(p, r);
    Etype           Temp;
    Temp = A[p];
    A[p] = A[i];
    A[i] = Temp;
    return Partition(A, p, r);

** CLR, page 154
size_t          Partition(Etype A[], size_t p, size_t r)
    Etype           x,
    size_t          i,

    x = A[p];
    i = p - 1;
    j = r + 1;

    for (;;) {
        do {
        } while (!(A[j] <= x));
        do {
        } while (!(A[i] >= x));
        if (i < j) {
            temp = A[i];
            A[i] = A[j];
            A[j] = temp;
        } else
            return j;

> 	- WHERE x>value ORDER BY x LIMIT N
> 	- and other variants
> 	Or, you are doing a Merge JOIN against some other table ; in
> case,
> yes, you might need the whole sorted terabyte table, but most likely
> are WHERE clauses in the query that restrict the set, and thus, maybe
> can get some conditions or limit values on the column to sort.

Where clause filters are to be applied AFTER the join operations,
according to the SQL standard.

> 	Also the new, optimized hash join, which is more memory
> might
> cover this case.

For == joins.  Not every order by is applied to joins.  And not every
join is an equal join.

> 	Point is, sometimes, you only need part of the results of your
> And
> the bigger the sort, the most likely it becomes that you only want
part of
> the results.

That is an assumption that will sometimes be true, and sometimes not.
It is not possible to predict usage patterns for a general purpose
database system.

> So, while we're in the fun hand-waving, new algorithm trying
> mode, why not consider this right from the start ? (I know I'm totally
> hand-waving mode right now, so slap me if needed).
> 	I'd say your new, fancy sort algorithm needs a few more input
> :
> 	- Range of values that must appear in the final result of the
sort :
> 		none, minimum, maximum, both, or even a set of values
from the
> other
> side of the join, hashed, or sorted.

That will already happen (or it certainly ought to)

> 	- LIMIT information (first N, last N, none)

That will already happen (or it certainly ought to -- I would be pretty
surprised if it does not happen)

> 	- Enhanced Limit information (first/last N values of the second
> column to
> sort, for each value of the first column) (the infamous "top10 by
> category" query)
> 	- etc.

All the filters will (at some point) be applied to the data unless they
cannot be applied to the data by formal rule.
> 	With this, the amount of data that needs to be kept in memory is
> dramatically reduced, from the whole table (even using your compressed
> keys, that's big) to something more manageable which will be closer to
> size of the final result set which will be returned to the client, and
> avoid a lot of effort.

Sorting the minimal set is a good idea.  Sometimes there is a big
savings there. I would be pretty surprised if a large fraction of data
that does not have to be included is actually processed during the
> 	So, this would not be useful in all cases, but when it applies,
> would
> be really useful.

No argument there.  And if an algorithm is being reworked, it is a good
idea to look at things like filtering to see if all filtering that is
allowed by the language standard before the sort takes place is applied.

pgsql-performance by date

Next:From: Jim C. NasbyDate: 2005-09-30 19:34:59
Subject: Re: database bloat, but vacuums are done, and fsm seems to be setup ok
Previous:From: Josh BerkusDate: 2005-09-30 17:23:30
Subject: Re: [PERFORM] A Better External Sort?

pgsql-hackers by date

Next:From: Joshua D. DrakeDate: 2005-09-30 17:53:22
Subject: Re: Found small issue with OUT params
Previous:From: Tony CadutoDate: 2005-09-30 17:41:48
Subject: Re: Found small issue with OUT params

Privacy Policy | About PostgreSQL
Copyright © 1996-2017 The PostgreSQL Global Development Group