Quick Links

Re: Print correct startup cost for the group aggregate.

From:	Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To:	Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Print correct startup cost for the group aggregate.
Date:	2017-03-06 09:25:03
Message-ID:	CAFjFpRfvJEsStm3trC7h_GGDPC-YMqaXMjGUHo5e4QMMtYRr5g@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

>
>
> I understood you reasoning of why startup_cost = input_startup_cost and not
> input_total_cost for aggregation by sorting. But what I didn't understand is
> how come higher startup cost for aggregation by sorting would force hash
> aggregation to be chosen? I am not clear about this part.

See this comment in cost_agg()
* Note: in this cost model, AGG_SORTED and AGG_HASHED have exactly the
* same total CPU cost, but AGG_SORTED has lower startup cost. If the
* input path is already sorted appropriately, AGG_SORTED should be
* preferred (since it has no risk of memory overflow).

AFAIU, if the input is already sorted, aggregation by sorting and
aggregation by hashing will have almost same cost, the startup cost of
AGG_SORTED being lower than AGG_HASHED. Because of lower startup cost,
AGG_SORTED gets chosen by add_path() over AGG_HASHED path.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

In response to

Re: Print correct startup cost for the group aggregate. at 2017-03-06 09:16:59 from Rushabh Lathia

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Amit Langote	2017-03-06 09:41:02	Re: UPDATE of partition key
Previous Message	Kyotaro HORIGUCHI	2017-03-06 09:20:06	Re: Restricting maximum keep segments by repslots