Re: Planning counters in pg_stat_statements (using pgss_store)

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: Julien Rouhaud <rjuju123(at)gmail(dot)com>
Cc: Sergei Kornilov <sk(at)zsrv(dot)org>, "imai(dot)yoshikazu(at)fujitsu(dot)com" <imai(dot)yoshikazu(at)fujitsu(dot)com>, legrand legrand <legrand_legrand(at)hotmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Planning counters in pg_stat_statements (using pgss_store)
Date: 2020-03-27 13:01:40
Message-ID: 807c7d80-c748-3e86-d6e7-cfa01184ee92@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2020/03/27 19:00, Julien Rouhaud wrote:
> On Thu, Mar 26, 2020 at 02:22:47PM +0100, Julien Rouhaud wrote:
>> On Thu, Mar 26, 2020 at 08:08:35PM +0900, Fujii Masao wrote:
>>>
>>> Here are other comments.
>>>
>>> - if (jstate)
>>> + if (kind == PGSS_JUMBLE)
>>>
>>> Why is PGSS_JUMBLE necessary? ISTM that we can still use jstate here, instead.
>>>
>>> If it's ok to remove PGSS_JUMBLE, we can define PGSS_NUM_KIND(=2) instead
>>> and replace 2 in, e.g., total_time[2] with PGSS_NUM_KIND. Thought?
>>
>> Yes, we could be using jstate here. I originally used that to avoid passing
>> PGSS_EXEC (or the other one) as a way to say "ignore this information as
>> there's the jstate which says it's yet another meaning". If that's not an
>> issue, I can change that as PGSS_NUM_KIND will clearly improve the explicit "2"
>> all over the place.
>
> Done, passing PGSS_PLAN when jumble is intended, with a comment saying that the
> pgss_kind is ignored in that case.
>
>>> + <entry><structfield>total_time</structfield></entry>
>>> + <entry><type>double precision</type></entry>
>>> + <entry></entry>
>>> + <entry>
>>> + Total time spend planning and executing the statement, in milliseconds
>>> + </entry>
>>> + </row>
>>>
>>> pg_stat_statements view has this column but the function not.
>>> We should make both have the column or not at all, for consistency?
>>> I'm not sure if it's good thing to expose the sum of total_plan_time
>>> and total_exec_time as total_time. If some users want that, they can
>>> easily calculate it from total_plan_time and total_exec_time by using
>>> their own logic.
>>
>> I think we originally added it as a way to avoid too much compatibility break,
>> and also because it seems like a field most users will be interested in anyway.
>> Now that I'm thinking about it again, I indeed think it was a mistake to have
>> that in view part only. Not mainly for consistency, but for users who would be
>> interested in the total_time field while not wanting to pay the overhead of
>> retrieving the query text if they don't need it. So I'll change that!
>
> Done
>

Thanks for updating the patch! But I'm still wondering if it's really
good thing to expose total_time. For example, when the query fails
with an error many times and "calls" becomes very different from
"plans", "total_plan_time" + "total_exec_time" is really what the users
are interested in? Some users may be interested in the sum of mean
times, but it's not exposed...

So what I'd like to say is that the information that users are interested
in would vary on each situation and case. At least for me it seems
enough for pgss to report only the basic information. Then users
can calculate to get the numbers (like total_time) they're interested in,
from those basic information.

But of course, I'd like to hear more opinions about this...

+ if (api_version >= PGSS_V1_8)
+ values[i++] = Int64GetDatumFast(tmp.total_time[0] +
+ tmp.total_time[1]);

BTW, Int64GetDatumFast() should be Float8GetDatumFast()?

>>> + nested_level++;
>>> + PG_TRY();
>>>
>>> In old thread [1], Tom Lane commented the usage of nested_level
>>> in the planner hook. There seems no reply to that so far. What's
>>> your opinion about that comment?
>>>
>>> [1] https://www.postgresql.org/message-id/28980.1515803777@sss.pgh.pa.us
>>
>> Oh thanks, I didn't noticed this part of the discussion. I agree with Tom's
>> concern, and I think that having a specific nesting level variable for the
>> planner is the best workaround, so I'll implement that.
>
> Done.
>
> I also exported BufferUsageAccumDiff as mentioned previously, as it seems
> clearner and will avoid future useless code churn, and run pgindent.

Many thanks!! I'm thinking to commit this part separately.
So I made that patch based on your patch. Attached.

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters

Attachment Content-Type Size
expose_BufferUsageAccumDiff_v1.patch text/plain 3.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Muhammad Usama 2020-03-27 13:06:14 Re: Transactions involving multiple postgres foreign servers, take 2
Previous Message Peter Eisentraut 2020-03-27 11:29:23 Re: INSERT ... OVERRIDING USER VALUE vs GENERATED ALWAYS identity columns