Re: Partition-wise aggregation/grouping

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Partition-wise aggregation/grouping
Date: 2017-10-13 07:43:45
Message-ID: CAKJS1f-TPaUFH0v4ap=p35B2iZ-XHB9fnY_FLvq=DvDHmSDabw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 13 October 2017 at 19:36, Jeevan Chalke
<jeevan(dot)chalke(at)enterprisedb(dot)com> wrote:
> I have tried exactly same tests to get to this factor on my local developer
> machine. And with parallelism enabled I got this number as 7.9. However, if
> I disable the parallelism (and I believe David too disabled that), I get
> this number as 1.8. Whereas for 10000 rows, I get this number to 1.7
>
> -- With Gather
> # select current_Setting('cpu_tuple_cost')::float8 / ((10633.56 * (81.035 /
> 72.450) - 10633.56) / 1000000);
> 7.9
>
> -- Without Gather
> # select current_Setting('cpu_tuple_cost')::float8 / ((16925.01 * (172.838 /
> 131.400) - 16925.01) / 1000000);
> 1.8
>
> -- With 10000 rows (so no Gather too)
> # select current_Setting('cpu_tuple_cost')::float8 / ((170.01 * (1.919 /
> 1.424) - 170.01) / 10000);
> 1.7
>
> So it is not so straight forward to come up the correct heuristic here. Thus
> using 50% of cpu_tuple_cost look good to me here.
>
> As suggested by Ashutosh and Robert, attached separate small WIP patch for
> it.

Good to see it stays fairly consistent at different tuple counts, and
is not too far away from what I got on this machine.

I looked over the patch and saw this:

@@ -1800,6 +1827,9 @@ cost_merge_append(Path *path, PlannerInfo *root,
*/
run_cost += cpu_operator_cost * tuples;

+ /* Add MergeAppend node overhead like we do it for the Append node */
+ run_cost += cpu_tuple_cost * DEFAULT_APPEND_COST_FACTOR * tuples;
+
path->startup_cost = startup_cost + input_startup_cost;
path->total_cost = startup_cost + run_cost + input_total_cost;
}

You're doing that right after a comment that says we don't do that. It
also does look like the "run_cost += cpu_operator_cost * tuples" is
trying to do the same thing, so perhaps it's worth just replacing
that, which by default will double that additional cost, although
doing so would have the planner slightly prefer a MergeAppend to an
Append than previously.

+#define DEFAULT_APPEND_COST_FACTOR 0.5

I don't really think the DEFAULT_APPEND_COST_FACTOR adds much. it
means very little by itself. It also seems that most of the other cost
functions just use the magic number.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message KES 2017-10-13 08:13:43 Re: Fwd: [BUGS] BUG #14850: Implement optinal additinal parameter for 'justify' date/time function
Previous Message Andres Freund 2017-10-13 07:39:14 Re: pgsql: Improve performance of SendRowDescriptionMessage.