RE: Planning counters in pg_stat_statements (using pgss_store)

From: "imai(dot)yoshikazu(at)fujitsu(dot)com" <imai(dot)yoshikazu(at)fujitsu(dot)com>
To: 'Julien Rouhaud' <rjuju123(at)gmail(dot)com>
Cc: "tomas(dot)vondra(at)2ndquadrant(dot)com" <tomas(dot)vondra(at)2ndquadrant(dot)com>, legrand legrand <legrand_legrand(at)hotmail(dot)com>, "sk(at)zsrv(dot)org" <sk(at)zsrv(dot)org>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Imai Yoshikazu <yoshikazu_i443(at)live(dot)jp>
Subject: RE: Planning counters in pg_stat_statements (using pgss_store)
Date: 2019-11-12 04:41:54
Message-ID: OSBPR01MB46163FDEF109B96C60A7989294770@OSBPR01MB4616.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Nov 9, 2019 at 1:36 PM, Julien Rouhaud wrote:
> On Fri, Nov 8, 2019 at 3:31 PM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
> >
> > On Fri, Nov 8, 2019 at 5:35 AM imai(dot)yoshikazu(at)fujitsu(dot)com
> > <imai(dot)yoshikazu(at)fujitsu(dot)com> wrote:
> > >
> > > On Tue, Sept 10, 2019 at 11:27 PM, Julien Rouhaud wrote:
> > > > > [0002 patch]
> > > > > In pgss_planner_hook:
> > > > >
> > > > > + /* calc differences of buffer counters. */
> > > > > + bufusage =
> > > > > + compute_buffer_counters(bufusage_start, pgBufferUsage);
> > > > > +
> > > > > + /*
> > > > > + * we only store planning duration, query text has been initialized
> > > > > + * during previous pgss_post_parse_analyze as it not available inside
> > > > > + * pgss_planner_hook.
> > > > > + */
> > > > > + pgss_store(query_text,
> > > > >
> > > > > Do we need to calculate bufusage in here?
> > > > > We only store planning duration in the following pgss_store.
> > > >
> > > > Good point! Postgres can definitely access some buffers while
> > > > planning a query (the most obvious example would be
> > > > get_actual_variable_range()), but as far as I can tell those were
> > > > previously not accounted for with pg_stat_statements as
> > > > queryDesc->totaltime->bufusage is only accumulating buffer usage
> > > > queryDesc->totaltime->in
> > > > the executor, and indeed current patch also ignore such computed
> > > > counters.
> > > >
> > > > I think it would be better to keep this bufusage calculation
> > > > during planning and fix pgss_store() to process them, but this
> > > > would add
> > > slightly more overhead.
> > >
> > > Sorry for my late reply.
> > > I think overhead would be trivial and we can include bufusage of
> > > planning from the POV of overhead, but yeah, it will be backward
> > > incompatibility if we include them.
> >
> > Ok, let's keep planning's bufusage then.
> >
> > > BTW, ISTM it is good for including {min,max,mean,stddev}_plan_time
> > > to pg_stat_statements. Generally plan_time would be almost the same
> > > time in each execution for the same query, but there are some
> > > exceptions. For example, if we use prepare statements which uses
> > > partition tables, time differs largely between creating a general plan and creating a custom plan.
> > >
> > > 1. Create partition table which has 1024 partitions.
> > > 2. Prepare select and update statements.
> > > sel) prepare sel(int) as select * from pt where a = $1;
> > > upd) prepare upd(int, int) as update pt set a = $2 where a = $1;
> > > 3. Execute each statement for 8 times.
> > > 3-1. Select from pg_stat_statements view after every execution.
> > > select query, plans, total_plan_time, calls, total_exec_time
> > > from pg_stat_statements where query like 'prepare%';
> > >
> > >
> > > Results of pg_stat_statements of sel) are
> > > query | plans | total_plan_time | calls | total_exec_time
> > > ---------------------------------------------------+-------+-----------------+-------+-----------------
> > > prepare sel(int) as select * from pt where a = $1 | 1 | 0.164361 | 1 | 0.004613
> > > prepare sel(int) as select * from pt where a = $1 | 2 | 0.27715500000000004 | 2 | 0.009447
> > > prepare sel(int) as select * from pt where a = $1 | 3 | 0.39100100000000004 | 3 | 0.014281
> > > prepare sel(int) as select * from pt where a = $1 | 4 | 0.504004 | 4 | 0.019265
> > > prepare sel(int) as select * from pt where a = $1 | 5 | 0.628242 | 5 | 0.024091
> > > prepare sel(int) as select * from pt where a = $1 | 7 | 24.213586000000003 | 6 | 0.029144
> > > prepare sel(int) as select * from pt where a = $1 | 8 | 24.368900000000004 | 7 | 0.034099
> > > prepare sel(int) as select * from pt where a = $1 | 9 | 24.527956000000003 | 8 | 0.046152
> > >
> > >
> > > Results of pg_stat_statements of upd) are
> > > prepare upd(int, int) as update pt set a = $2 where a = $1 | 1 | 0.280099 | 1 | 0.013138
> > > prepare upd(int, int) as update pt set a = $2 where a = $1 | 2 | 0.405416 | 2 | 0.01894
> > > prepare upd(int, int) as update pt set a = $2 where a = $1 | 3 | 0.532361 | 3 | 0.040716
> > > prepare upd(int, int) as update pt set a = $2 where a = $1 | 4 | 0.671445 | 4 | 0.046566
> > > prepare upd(int, int) as update pt set a = $2 where a = $1 | 5 | 0.798531 | 5 | 0.052729000000000005
> > > prepare upd(int, int) as update pt set a = $2 where a = $1 | 7 | 896.915458 | 6 | 0.05888600000000001
> > > prepare upd(int, int) as update pt set a = $2 where a = $1 | 8 | 897.043512 | 7 | 0.064446
> > > prepare upd(int, int) as update pt set a = $2 where a = $1 | 9 | 897.169711 | 8 | 0.070644
> > >
> > >
> > > How do you think about that?
> >
> > That's indeed a very valid point and something we should help user to
> > investigate. I'll submit an updated patch with support for
> > min/max/mean/stddev plan time shortly.
>
> I attach v3 patches implementing those counters.

Thanks for updating the patch! Now I can see min/max/mean/stddev plan time.

> Note that to avoid duplicating some code (related to Welford's method),
> I switched some of the counters to arrays rather than scalar variables. It unfortunately makes
> pg_stat_statements_internal() a little bit messy, but I hope that it's still acceptable.

Yeah, I also think it's acceptable, but I think the codes like below one is more
understandable than using for loop in pg_stat_statements_internal(),
although some codes will be duplicated.

pg_stat_statements_internal():

if (api_version >= PGSS_V1_8)
{
kind = PGSS_PLAN;
values[i++] = Int64GetDatumFast(tmp.calls[kind]);
values[i++] = Float8GetDatumFast(tmp.total_time[kind]);
values[i++] = Float8GetDatumFast(tmp.min_time[kind]);
values[i++] = Float8GetDatumFast(tmp.max_time[kind]);
values[i++] = Float8GetDatumFast(tmp.mean_time[kind]);
values[i++] = Float8GetDatumFast(stddev(tmp));
}

kind = PGSS_EXEC;
values[i++] = Int64GetDatumFast(tmp.calls[kind]);
values[i++] = Float8GetDatumFast(tmp.total_time[kind]);
if (api_version >= PGSS_V1_3)
{
values[i++] = Float8GetDatumFast(tmp.min_time[kind]);
values[i++] = Float8GetDatumFast(tmp.max_time[kind]);
values[i++] = Float8GetDatumFast(tmp.mean_time[kind]);
values[i++] = Float8GetDatumFast(stddev(tmp));
}

stddev(Counters counters)
{
/*
* Note we are calculating the population variance here, not the
* sample variance, as we have data for the whole population, so
* Bessel's correction is not used, and we don't divide by
* tmp.calls - 1.
*/
if (counters.calls[kind] > 1)
return stddev = sqrt(counters.sum_var_time[kind] / counters.calls[kind]);

return 0.0;
}

> While doing this refactoring
> I saw that previous patches were failing to accumulate the buffers used during planning, which is now fixed.

Checked.
Now buffer usage columns include buffer usage during planning and executing,
but if we turn off track_planning, buffer usage during planning is also not
tracked which is good for users who don't want to take into account of that.

What I'm concerned about is column names will not be backward-compatible.
{total, min, max, mean, stddev}_{plan, exec}_time are the best names which
correctly show the meaning of its value, but we can't use
{total, min, max, mean, stddev}_time anymore and it might be harmful for
some users.
I don't come up with any good idea for that...

--
Yoshikazu Imai

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2019-11-12 04:50:03 Re: [PATCH] use separate PartitionedRelOptions structure to store partitioned table options
Previous Message Michael Paquier 2019-11-12 04:31:34 Re: Attempt to consolidate reading of XLOG page