| From: | Mischa <mischa(dot)Sandberg(at)telus(dot)net> |
|---|---|
| To: | pgsql-hackers(at)postgresql(dot)org |
| Subject: | Re: Group-count estimation statistics |
| Date: | 2005-01-29 10:10:22 |
| Message-ID: | 1106993422.41fb610e555cf@webmail.telus.net |
| Lists: | pgsql-hackers |
> From: Sailesh Krishnamurthy <sailesh(at)cs(dot)berkeley(dot)edu>
> >>>>> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
>
> Tom> The only real solution, of course, is to acquire cross-column
> Tom> statistics, but I don't see that happening in the near
> Tom> future.
>
> Another approach is a hybrid hashing scheme where we use a hash table
> until we run out of memory at which time we start spilling to disk. In
> other words, no longer use SortAgg at all ..
>
> Under what circumstances will a SortAgg consume more IOs than a
> hybrid hash strategy?
Goetz Graefe did a heck of a lot of analysis of this, prior to his being snapped
up by Microsoft. He also worked out a lot of the nitty-gritty for hybrid hash
algorithms, extending the Grace hash for spill-to-disk, and adding a kind of
recursion for really huge sets. The figures say that hybrid hash beats
sort-aggregate, across the board.
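
For anyone who wants to see the shape of the idea, here is a rough sketch of a spill-to-disk hash aggregate that counts rows per group. It is not PostgreSQL executor code and not Graefe's exact algorithm: it keeps the first `max_groups` distinct keys resident instead of keeping one whole partition resident, and the names `hybrid_hash_count`, `max_groups`, `fanout` and `level` are made up for illustration. Rows whose key does not fit in the table are hashed into temp-file partitions, and each partition is then aggregated by the same routine, which is the recursion for really huge sets.

```python
import pickle
import tempfile


def hybrid_hash_count(rows, max_groups=4, fanout=4, level=0):
    """Yield (key, count) pairs, holding at most max_groups keys in memory."""
    table = {}
    spills = [None] * fanout  # temp files, created only when needed

    for key in rows:
        if key in table:
            table[key] += 1
        elif len(table) < max_groups:
            table[key] = 1
        else:
            # Table is full: send the row to an on-disk partition.
            # Mixing in the level changes the split on each recursion.
            part = hash((level, key)) % fanout
            if spills[part] is None:
                spills[part] = tempfile.TemporaryFile()
            pickle.dump(key, spills[part])

    # A key is either fully resident or fully spilled, because nothing is
    # ever evicted from the table, so these counts are already final.
    yield from table.items()
    table.clear()

    # Aggregate each spilled partition with the same routine.
    for f in spills:
        if f is None:
            continue
        f.seek(0)
        yield from hybrid_hash_count(_read_keys(f), max_groups, fanout, level + 1)
        f.close()


def _read_keys(f):
    """Read pickled keys back from a spill file until end of file."""
    while True:
        try:
            yield pickle.load(f)
        except EOFError:
            return
```

Each pass settles at least `max_groups` distinct keys in memory, so the recursion terminates even if the hash splits badly. Rows for resident groups never touch disk, which is where the IO saving over a full external sort comes from when the data has many duplicates.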