From: | Teodor Sigaev <teodor(at)sigaev(dot)ru> |
---|---|
To: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: cost_sort() improvements |
Date: | 2018-07-12 14:42:29 |
Message-ID: | ce8eff53-52f2-e7e6-0059-8527c3f2892d@sigaev.ru |
Lists: | pgsql-hackers |
> OK, so Fi is pretty much whatever CREATE FUNCTION ... COST says, right?
exactly
> Hmm, makes sense. But doesn't that mean it's mostly a fixed per-tuple
> cost, not directly related to the comparison? For example, why should it
> be multiplied by C0? That is, if I create a very expensive comparator
> (say, with cost 100), why should it increase the cost for transferring
> the tuple to CPU cache, unpacking it, etc.?
>
> I'd say those costs are rather independent of the function cost, and
> remain rather fixed, no matter what the function cost is.
>
> Perhaps you haven't noticed that, because the default funcCost is 1?
Maybe, but see my email
https://www.postgresql.org/message-id/ee14392b-d753-10ce-f5ed-7b2f7e277512%40sigaev.ru
about additional term proportional to N
> The number of new magic constants introduced by this patch is somewhat
> annoying. 2.0, 1.5, 0.125, ... :-(
2.0 was removed in the last patch; 1.5 is left and could be removed once I
understand your letter about group size estimation :)
0.125 should be checked, and I suppose we can't remove it at all because it is
the "average over whole word" constant.
>
>> - Final cost is cpu_operator_cost * N * sum(per column costs described
>>   above). Note, for single column with width <= sizeof(datum) and F1 = 1
>>   this formula gives exactly the same result as current one.
>> - for Top-N sort empiric is close to old one: use 2.0 multiplier as
>>   constant under log2, and use log2(Min(NGi, output_tuples)) for second
>>   and following columns.
>>
>
> I think compute_cpu_sort_cost is somewhat confused whether
> per_tuple_cost is directly a cost, or a coefficient that will be
> multiplied with cpu_operator_cost to get the actual cost.
>
> At the beginning it does this:
>
> per_tuple_cost = comparison_cost;
>
> so it inherits the value passed to cost_sort(), which is supposed to be
> cost. But then it does the work, which includes things like this:
>
> per_tuple_cost += 2.0 * funcCost * LOG2(tuples);
>
> where funcCost is pretty much pg_proc.procost. AFAIK that's meant to be
> a value in units of cpu_operator_cost. And at the end it does this
>
> per_tuple_cost *= cpu_operator_cost;
>
> I.e. it gets multiplied with another cost. That doesn't seem right.
Huh, you are right, will fix in v8.
> Also, why do we need this?
>
> if (sortop != InvalidOid)
> {
>     Oid funcOid = get_opcode(sortop);
>
>     funcCost = get_func_cost(funcOid);
> }
Safety first :). Will remove.
--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/