Re: cost_sort() improvements

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: cost_sort() improvements
Date: 2018-07-08 23:22:15
Message-ID: 8c672f6e-778d-6664-8009-10751fe2dbd8@2ndquadrant.com
Lists: pgsql-hackers

Hi,

I'll do more experiments/tests next week, but let me share at least some
initial thoughts ...

On 06/28/2018 06:47 PM, Teodor Sigaev wrote:
> Hi!
>
> Current estimation of the sort cost has the following issues:
>  - it doesn't distinguish between single-column and multi-column sorts
>  - it doesn't pay attention to the comparison function cost and column
>    width
>  - it doesn't try to count the number of comparison function calls on a
>    per-column basis
>
> At least [1] and [2] run into these issues and raise objections/questions
> about the correctness of the sort cost estimation. The suggested patch
> tries to improve the current estimation and solve these issues.
>

For those not following the two patches ("GROUP BY optimization" [1] and
"Incremental sort" [2]): This hasn't been a major issue so far, because
we neither reorder the grouping keys to make the sort cheaper (which is
what [1] does), nor ignore some of the sort keys (which is pretty much
what [2] does). Both patches need to estimate the number of comparisons
on different columns and/or the size of groups (i.e. ndistinct).

> Basic ideas:
>  - let me introduce some designations
>     i   - index of column, starting with 1
>     N   - number of tuples to sort
>     Gi  - number of groups defined by the first i columns. Obviously,
>           G0 == 1. The number of groups is calculated by
>           estimate_num_groups().
>     NGi - number of tuples in one group. NG0 == N and NGi = N/G(i-1)
>     Fi  - cost of a comparison function call for the i-th column

OK, so Fi is pretty much whatever CREATE FUNCTION ... COST says, right?

>     Wi  - average width in bytes of the i-th column
>     Ci  - total cost of one comparison call for the column, calculated
>           as Fi * fine_function(Wi), where fine_function(Wi) looks like:
>           if (Wi <= sizeof(Datum))
>               return 1.0; // any type passed in a Datum
>           else
>               return 1 + log16(Wi/sizeof(Datum));
>           log16 is just an empirical choice
>  - So, the cost for the first column is 2.0 * C0 * log2(N).
>    The first column is always compared; the multiplier 2.0 was chosen
>    based on regression tests. It seems to estimate the cost of
>    transferring the tuple to the CPU cache, the startup cost of
>    unpacking the tuple, the cost of calling the qsort compare wrapper
>    function, etc. Removing this multiplier causes too optimistic a
>    prediction of the cost.

Hmm, makes sense. But doesn't that mean it's mostly a fixed per-tuple
cost, not directly related to the comparison? For example, why should it
be multiplied by C0? That is, if I create a very expensive comparator
(say, with cost 100), why should it increase the cost for transferring
the tuple to CPU cache, unpacking it, etc.?

I'd say those costs are rather independent of the function cost, and
remain rather fixed, no matter what the function cost is.

Perhaps you haven't noticed that, because the default funcCost is 1?

>  - cost for i-th columns:
>    Ci * log2(NGi) => Ci * log2(N/G(i-1))

OK. So essentially for each column we take log2(tuples_per_group), as
total number of comparisons in quick-sort is O(N * log2(N)).

>    So, here I assume that the i-th comparison function will be called
>    only inside a group defined by the (i-1) leading columns. Note,
>    following the discussion in [1], I add a multiplier of 1.5 here to
>    be closer to the worst case, when groups differ significantly.
>    Right now there is no way to determine how different they could be.
>    Note, this formula describes the cost of the first column too,
>    except for the different multiplier.

The number of new magic constants introduced by this patch is somewhat
annoying. 2.0, 1.5, 0.125, ... :-(

>  - The final cost is cpu_operator_cost * N * sum(per-column costs
>    described above). Note, for a single column with width <=
>    sizeof(Datum) and F1 = 1 this formula gives exactly the same result
>    as the current one.
>  - For a Top-N sort the empiric is close to the old one: use the 2.0
>    multiplier as the constant under log2, and use
>    log2(Min(NGi, output_tuples)) for the second and following columns.
>

I think compute_cpu_sort_cost is somewhat confused whether
per_tuple_cost is directly a cost, or a coefficient that will be
multiplied with cpu_operator_cost to get the actual cost.

At the beginning it does this:

per_tuple_cost = comparison_cost;

so it inherits the value passed to cost_sort(), which is supposed to be
cost. But then it does the work, which includes things like this:

per_tuple_cost += 2.0 * funcCost * LOG2(tuples);

where funcCost is pretty much pg_proc.procost. AFAIK that's meant to be
a value in units of cpu_operator_cost. And at the end it does this:

per_tuple_cost *= cpu_operator_cost;

I.e. it gets multiplied with another cost. That doesn't seem right.

For most cost_sort calls this may not matter, as the comparison_cost is
set to 0.0, but plan_cluster_use_sort() sets it explicitly to:

comparisonCost
= 2.0 * (indexExprCost.startup + indexExprCost.per_tuple);

which is likely to cause issues.

Also, why do we need this?

    if (sortop != InvalidOid)
    {
        Oid funcOid = get_opcode(sortop);

        funcCost = get_func_cost(funcOid);
    }

Presumably if we get to costing sort, we know the pathkey can be sorted
and has sort operator, no? Or am I missing something?

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
