Re: Hash support for grouping sets

From: Mark Dilger <hornschnorter(at)gmail(dot)com>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hash support for grouping sets
Date: 2017-03-23 18:46:39
Message-ID: 3D7C328C-ADA8-4026-A613-710D25B317AD@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


> On Mar 23, 2017, at 11:22 AM, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> wrote:
>
>>>>>> "Mark" == Mark Dilger <hornschnorter(at)gmail(dot)com> writes:
>
> Mark> You define DiscreteKnapsack to take integer weights and double
> Mark> values, and perform the usual Dynamic Programming algorithm to
> Mark> solve. But the only place you call this, you pass in NULL for
> Mark> the values, indicating that all the values are equal. I'm
> Mark> confused why in this degenerate case you bother with the DP at
> Mark> all. Can't you just load the knapsack from lightest to heaviest
> Mark> after sorting, reducing the runtime complexity to O(nlogn)? It's
> Mark> been a long day. Sorry if I am missing something obvious.
>
> That's actually a very good question.
>
> (Though in passing I should point out that the runtime cost is going to
> be negligible in all practical cases. Even an 8-way cube (256 grouping
> sets) has only 70 rollups, and a 4-way cube has only 6; the limit of
> 4096 grouping sets was only chosen because it was a nice round number
> that was larger than comparable limits in other database products. Other
> stuff in the grouping sets code has worse time bounds; the
> bipartite-match code used to minimize the number of rollups is
> potentially O(n^2.5) in the number of grouping sets.)
>
> Thinking about it, it's not at all obvious what we should be optimizing
> for here in the absence of accurate sort costs.

Is there a performance test case where this patch should shine
brightest? I'd like to load a schema with lots of data, and run
a grouping sets query, both before and after applying the patch,
to see what the performance advantage is.

mark

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavan Deolasee 2017-03-23 18:47:08 Re: Patch: Write Amplification Reduction Method (WARM)
Previous Message Andres Freund 2017-03-23 18:42:43 Re: Hash support for grouping sets