Re: Planning counters in pg_stat_statements (using pgss_store)

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: "imai(dot)yoshikazu(at)fujitsu(dot)com" <imai(dot)yoshikazu(at)fujitsu(dot)com>
Cc: "tomas(dot)vondra(at)2ndquadrant(dot)com" <tomas(dot)vondra(at)2ndquadrant(dot)com>, legrand legrand <legrand_legrand(at)hotmail(dot)com>, "sk(at)zsrv(dot)org" <sk(at)zsrv(dot)org>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Imai Yoshikazu <yoshikazu_i443(at)live(dot)jp>
Subject: Re: Planning counters in pg_stat_statements (using pgss_store)
Date: 2019-11-13 10:49:39
Message-ID: CAOBaU_bzrbAO2amK55AQF3=ivgX6YjOBzZ4bBiHKf_faLK5-Gg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Nov 12, 2019 at 5:41 AM imai(dot)yoshikazu(at)fujitsu(dot)com
<imai(dot)yoshikazu(at)fujitsu(dot)com> wrote:
>
> On Sat, Nov 9, 2019 at 1:36 PM, Julien Rouhaud wrote:
> >
> > I attach v3 patches implementing those counters.
>
> Thanks for updating the patch! Now I can see min/max/mean/stddev plan time.
>
>
> > Note that to avoid duplicating some code (related to Welford's method),
> > I switched some of the counters to arrays rather than scalar variables. It unfortunately makes
> > pg_stat_statements_internal() a little bit messy, but I hope that it's still acceptable.
>
> Yeah, I also think it's acceptable, but I think the codes like below one is more
> understandable than using for loop in pg_stat_statements_internal(),
> although some codes will be duplicated.
>
> pg_stat_statements_internal():
>
> if (api_version >= PGSS_V1_8)
> {
> kind = PGSS_PLAN;
> values[i++] = Int64GetDatumFast(tmp.calls[kind]);
> values[i++] = Float8GetDatumFast(tmp.total_time[kind]);
> values[i++] = Float8GetDatumFast(tmp.min_time[kind]);
> values[i++] = Float8GetDatumFast(tmp.max_time[kind]);
> values[i++] = Float8GetDatumFast(tmp.mean_time[kind]);
> values[i++] = Float8GetDatumFast(stddev(tmp));
> }
>
> kind = PGSS_EXEC;
> values[i++] = Int64GetDatumFast(tmp.calls[kind]);
> values[i++] = Float8GetDatumFast(tmp.total_time[kind]);
> if (api_version >= PGSS_V1_3)
> {
> values[i++] = Float8GetDatumFast(tmp.min_time[kind]);
> values[i++] = Float8GetDatumFast(tmp.max_time[kind]);
> values[i++] = Float8GetDatumFast(tmp.mean_time[kind]);
> values[i++] = Float8GetDatumFast(stddev(tmp));
> }
>
>
> stddev(Counters counters)
> {
> /*
> * Note we are calculating the population variance here, not the
> * sample variance, as we have data for the whole population, so
> * Bessel's correction is not used, and we don't divide by
> * tmp.calls - 1.
> */
> if (counters.calls[kind] > 1)
> return stddev = sqrt(counters.sum_var_time[kind] / counters.calls[kind]);
>
> return 0.0;
> }

Yes, that's also a possibility (though this should also take the
"kind" as parameter). I wanted to avoid adding a new function and
save some redundant code, but I can change it in the next version of
the patch if needed.

> > While doing this refactoring
> > I saw that previous patches were failing to accumulate the buffers used during planning, which is now fixed.
>
> Checked.
> Now buffer usage columns include buffer usage during planning and executing,
> but if we turn off track_planning, buffer usage during planning is also not
> tracked which is good for users who don't want to take into account of that.

Indeed. Note that there's a similar discussion on adding planning
buffer counters to explain in [1]. I'm unsure if merging planning and
execution counters in the same columns is ok or not.

> What I'm concerned about is column names will not be backward-compatible.
> {total, min, max, mean, stddev}_{plan, exec}_time are the best names which
> correctly show the meaning of its value, but we can't use
> {total, min, max, mean, stddev}_time anymore and it might be harmful for
> some users.
> I don't come up with any good idea for that...

Well, perhaps keeping the old {total, min, max, mean, stddev}_time
would be ok, as those historically meant "execution". I don't have a
strong opinion here.

[1] https://www.postgresql.org/message-id/20191112205506.rvadbx2dnku3paaw@alap3.anarazel.de

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2019-11-13 10:49:58 Re: BUG #16109: Postgres planning time is high across version - 10.6 vs 10.10
Previous Message Julien Rouhaud 2019-11-13 10:39:04 Re: BUG #16109: Postgres planning time is high across version - 10.6 vs 10.10