Re: Planning counters in pg_stat_statements (using pgss_store)

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: "imai(dot)yoshikazu(at)fujitsu(dot)com" <imai(dot)yoshikazu(at)fujitsu(dot)com>
Cc: "tomas(dot)vondra(at)2ndquadrant(dot)com" <tomas(dot)vondra(at)2ndquadrant(dot)com>, legrand legrand <legrand_legrand(at)hotmail(dot)com>, "sk(at)zsrv(dot)org" <sk(at)zsrv(dot)org>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Imai Yoshikazu <yoshikazu_i443(at)live(dot)jp>
Subject: Re: Planning counters in pg_stat_statements (using pgss_store)
Date: 2019-11-09 13:36:20
Message-ID: CAOBaU_Y12bn0tOdN9RMBZn29bfYYH11b2CwKO1RO7dX9fQ3aZA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Nov 8, 2019 at 3:31 PM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
>
> On Fri, Nov 8, 2019 at 5:35 AM imai(dot)yoshikazu(at)fujitsu(dot)com
> <imai(dot)yoshikazu(at)fujitsu(dot)com> wrote:
> >
> > On Tue, Sept 10, 2019 at 11:27 PM, Julien Rouhaud wrote:
> > > > [0002 patch]
> > > > In pgss_planner_hook:
> > > >
> > > > + /* calc differences of buffer counters. */
> > > > + bufusage = compute_buffer_counters(bufusage_start, pgBufferUsage);
> > > > +
> > > > + /*
> > > > + * we only store planning duration, query text has been initialized
> > > > + * during previous pgss_post_parse_analyze as it not available inside
> > > > + * pgss_planner_hook.
> > > > + */
> > > > + pgss_store(query_text,
> > > >
> > > > Do we need to calculate bufusage in here?
> > > > We only store planning duration in the following pgss_store.
> > >
> > > Good point! Postgres can definitely access some buffers while
> > > planning a query (the most obvious example would be
> > > get_actual_variable_range()), but as far as I can tell those were
> > > previously not accounted for with pg_stat_statements as
> > > queryDesc->totaltime->bufusage is only accumulating buffer usage in
> > > the executor, and indeed current patch also ignore such computed
> > > counters.
> > >
> > > I think it would be better to keep this bufusage calculation during
> > > planning and fix pgss_store() to process them, but this would add
> > slightly more overhead.
> >
> > Sorry for my late reply.
> > I think overhead would be trivial and we can include bufusage of planning from
> > the POV of overhead, but yeah, it will be backward incompatibility if we
> > include them.
>
> Ok, let's keep planning's bufusage then.
>
> > BTW, ISTM it is good for including {min,max,mean,stddev}_plan_time to
> > pg_stat_statements. Generally plan_time would be almost the same time in each
> > execution for the same query, but there are some exceptions. For example, if we
> > use prepare statements which uses partition tables, time differs largely
> > between creating a general plan and creating a custom plan.
> >
> > 1. Create partition table which has 1024 partitions.
> > 2. Prepare select and update statements.
> > sel) prepare sel(int) as select * from pt where a = $1;
> > upd) prepare upd(int, int) as update pt set a = $2 where a = $1;
> > 3. Execute each statement for 8 times.
> > 3-1. Select from pg_stat_statements view after every execution.
> > select query, plans, total_plan_time, calls, total_exec_time from pg_stat_statements where query like 'prepare%';
> >
> >
> > Results of pg_stat_statements of sel) are
> > query | plans | total_plan_time | calls | total_exec_time
> > ---------------------------------------------------+-------+-----------------+-------+-----------------
> > prepare sel(int) as select * from pt where a = $1 | 1 | 0.164361 | 1 | 0.004613
> > prepare sel(int) as select * from pt where a = $1 | 2 | 0.27715500000000004 | 2 | 0.009447
> > prepare sel(int) as select * from pt where a = $1 | 3 | 0.39100100000000004 | 3 | 0.014281
> > prepare sel(int) as select * from pt where a = $1 | 4 | 0.504004 | 4 | 0.019265
> > prepare sel(int) as select * from pt where a = $1 | 5 | 0.628242 | 5 | 0.024091
> > prepare sel(int) as select * from pt where a = $1 | 7 | 24.213586000000003 | 6 | 0.029144
> > prepare sel(int) as select * from pt where a = $1 | 8 | 24.368900000000004 | 7 | 0.034099
> > prepare sel(int) as select * from pt where a = $1 | 9 | 24.527956000000003 | 8 | 0.046152
> >
> >
> > Results of pg_stat_statements of upd) are
> > prepare upd(int, int) as update pt set a = $2 where a = $1 | 1 | 0.280099 | 1 | 0.013138
> > prepare upd(int, int) as update pt set a = $2 where a = $1 | 2 | 0.405416 | 2 | 0.01894
> > prepare upd(int, int) as update pt set a = $2 where a = $1 | 3 | 0.532361 | 3 | 0.040716
> > prepare upd(int, int) as update pt set a = $2 where a = $1 | 4 | 0.671445 | 4 | 0.046566
> > prepare upd(int, int) as update pt set a = $2 where a = $1 | 5 | 0.798531 | 5 | 0.052729000000000005
> > prepare upd(int, int) as update pt set a = $2 where a = $1 | 7 | 896.915458 | 6 | 0.05888600000000001
> > prepare upd(int, int) as update pt set a = $2 where a = $1 | 8 | 897.043512 | 7 | 0.064446
> > prepare upd(int, int) as update pt set a = $2 where a = $1 | 9 | 897.169711 | 8 | 0.070644
> >
> >
> > How do you think about that?
>
> That's indeed a very valid point and something we should help user to
> investigate. I'll submit an updated patch with support for
> min/max/mean/stddev plan time shortly.

I attach v3 patches implementing those counters. Note that to avoid
duplicating some code (related to Welford's method), I switched some
of the counters to arrays rather than scalar variables. It
unfortunately makes pg_stat_statements_internal() a little bit messy,
but I hope that it's still acceptable. While doing this refactoring I
saw that previous patches were failing to accumulate the buffers used
during planning, which is now fixed.

Attachment Content-Type Size
0002-Add-planning-counters-to-pg_stat_statements-v3.patch application/octet-stream 45.5 KB
0001-Pass-query-string-to-the-planner-v3.patch application/octet-stream 10.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2019-11-09 13:43:30 Re: Missing constant propagation in planner on hash quals causes join slowdown
Previous Message Christoph Moench-Tegeder 2019-11-09 13:33:49 Re: Monitoring disk space from within the server