verbose cost estimate

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: verbose cost estimate
Date: 2019-12-07 09:10:04
Message-ID: 20191207091004.GV2082@telsasoft.com
Lists: pgsql-hackers

Jeff said:
https://www.postgresql.org/message-id/CAMkU%3D1zBJNVo2DGYBgLJqpu8fyjCE_ys%2Bmsr6pOEoiwA7y5jrA%40mail.gmail.com
|What would I find very useful is a verbosity option to get the cost
|estimates expressed as a multiplier of each *_cost parameter, rather than
|just as a scalar.

I guess the goal is something like
EXPLAIN(COSTS, VERBOSE) -- or some new option?
..would show something like
Seq Scan on public.sites (cost=0.00..2.90 rows=160 width=107)
Total costs: Seq page: 1.01 Random page: 1.23 CPU tuple: .05 CPU oper: .01
Startup cost: [...]
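
A minimal sketch of how such a line could be formatted, assuming a breakdown with just the four fields shown in the sample above. The names CostBreakdown and format_cost_line are hypothetical and are not existing EXPLAIN code:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical breakdown holding only the four components in the sample. */
typedef struct CostBreakdown
{
    double seq;
    double rand;
    double cpu_tuple;
    double cpu_oper;
} CostBreakdown;

/*
 * Format one breakdown line (e.g. label "Total costs") into buf.
 * Returns whatever snprintf returns: the length that was needed.
 */
int
format_cost_line(char *buf, size_t buflen, const char *label,
                 const CostBreakdown *c)
{
    assert(buf != NULL && label != NULL && c != NULL);
    return snprintf(buf, buflen,
                    "%s: Seq page: %.2f Random page: %.2f"
                    " CPU tuple: %.2f CPU oper: %.2f",
                    label, c->seq, c->rand, c->cpu_tuple, c->cpu_oper);
}
```

A real implementation would presumably go through the existing EXPLAIN output machinery so that the non-text formats get the same fields.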

It seems to me that's "just" a matter of redefining Cost and fixing everything that breaks:

typedef struct Cost {
    double seq, rand;       /* sequential and random page fetches */
    double cpu_tuple, cpu_index_tuple, cpu_oper;
    double parallel_setup;  /* probably always in startup_cost, never in run_cost */
    double parallel_tuple;  /* probably always in run_cost, never in startup_cost */
    double disable;         /* disable_cost penalty */
} Cost;

I'm perhaps 50% done with that. Is there some agreement that it's a desirable
goal, and that this is a good way to do it?

To give an idea what I'm doing, there's a bunch of stuff like this:

- if (path1->startup_cost < path2->startup_cost)
+ if (cost_asscalar(&path1->startup_cost) < cost_asscalar(&path2->startup_cost))

- qual_arg_cost += index_qual_cost.startup + index_qual_cost.per_tuple;
+ cost_add2(&qual_arg_cost, &index_qual_cost.startup, &index_qual_cost.per_tuple);

- if (cost.per_tuple > 10 * cpu_operator_cost)
+ if (cost_isgt_scalar(&cost.per_tuple, 10 * cpu_operator_cost))

And a great deal of stuff like this:

- run_cost += cpu_run_cost;
+ cost_add(&run_cost, &cpu_run_cost);

/* tlist eval costs are paid per output row, not per tuple scanned */
- startup_cost += path->pathtarget->cost.startup;
- run_cost += path->pathtarget->cost.per_tuple * path->rows;
+ cost_add(&startup_cost, &path->pathtarget->cost.startup);
+ cost_add_mul(&run_cost, &path->pathtarget->cost.per_tuple, path->rows);

path->startup_cost = startup_cost;
- path->total_cost = startup_cost + run_cost;
+ cost_set_sum2(&path->total_cost, &startup_cost, &run_cost);

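As I've written it, that's somewhat different from Jeff's suggestion, as all
the entries in my struct are in units of cost, rather than multipliers of each
*_cost parameter. That seems easier because (for example) IO costs can be set
per tablespace, so there isn't always a single parameter value to factor out.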

I'd rather know sooner than later if there's a better way.

Justin
