Re: [Proposal] Add accumulated statistics for wait event

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Michael Paquier <michael(at)paquier(dot)xyz>
Cc: 임명규 <myungkyu(dot)lim(at)samsung(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, 홍도형 <don(dot)hong(at)samsung(dot)com>, 손우성 <woosung(dot)sohn(at)samsung(dot)com>
Subject: Re: [Proposal] Add accumulated statistics for wait event
Date: 2018-07-24 15:34:33
Message-ID: 0337d4a8-1632-9994-b629-c93f5ba68435@2ndquadrant.com
Lists: pgsql-hackers

On 07/23/2018 03:57 PM, Tom Lane wrote:
> Michael Paquier <michael(at)paquier(dot)xyz> writes:
>> This does not need a configure switch.
>
> It probably is there because the OP realizes that most people wouldn't
> accept having this code compiled in.
>
>> What's the performance penalty? I am pretty sure that this is
>> measurable as wait events are stored for a backend for each I/O
>> operation as well, and you are calling a C routine within an inlined
>> function which is designed to be light-weight, doing only a four-byte
>> atomic operation.
>
> On machines with slow gettimeofday(), I suspect the cost of this
> patch would be staggering. Even with relatively fast gettimeofday,
> it doesn't look acceptable for calls in hot code paths (for instance,
> lwlock.c).
>

Yeah. I wonder if we could measure the time for a small fraction of the
wait events, and estimate the actual duration from that.
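
A rough sketch of that sampling idea, not taken from the patch: time
only one wait in every SAMPLE_EVERY and scale the sampled mean by the
total number of waits. All names here (WaitSampleStats, SAMPLE_EVERY,
sample_wait_start, sample_wait_end, estimate_total_ns) are made up for
illustration, and clock_gettime() stands in for whatever clock source
would actually be used. errno is saved and restored around the clock
call because callers assume the report functions have no side effects.

```c
#define _POSIX_C_SOURCE 199309L
#include <errno.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical sampling rate: time one wait out of every 64. */
#define SAMPLE_EVERY 64

typedef struct WaitSampleStats
{
    uint64_t        calls;      /* every wait seen, timed or not */
    uint64_t        sampled;    /* waits that were actually timed */
    uint64_t        sampled_ns; /* summed duration of the timed waits */
    int             timing;     /* is the current wait being timed? */
    struct timespec start;      /* start time of the current timed wait */
} WaitSampleStats;

static inline void
sample_wait_start(WaitSampleStats *s)
{
    /* Untimed waits cost one modulo, one store and one increment. */
    s->timing = (s->calls % SAMPLE_EVERY == 0);
    s->calls++;

    if (s->timing)
    {
        int save_errno = errno; /* keep the call free of side effects */

        clock_gettime(CLOCK_MONOTONIC, &s->start);
        errno = save_errno;
    }
}

static inline void
sample_wait_end(WaitSampleStats *s)
{
    struct timespec end;
    int             save_errno;

    if (!s->timing)
        return;

    save_errno = errno;
    clock_gettime(CLOCK_MONOTONIC, &end);
    errno = save_errno;

    s->sampled_ns += (uint64_t) ((end.tv_sec - s->start.tv_sec) * 1000000000LL
                                 + (end.tv_nsec - s->start.tv_nsec));
    s->sampled++;
    s->timing = 0;
}

/* Estimated total wait time: mean of the sampled waits times all waits. */
static inline double
estimate_total_ns(const WaitSampleStats *s)
{
    if (s->sampled == 0)
        return 0.0;
    return (double) s->sampled_ns / (double) s->sampled * (double) s->calls;
}
```

With a fixed stride like this the estimate is only as good as the
assumption that the timed waits are representative of the untimed
ones, and the timed waits still pay the full cost of two clock reads.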

> A bigger problem is that it breaks stuff. There are countless
> calls to pgstat_report_wait_start/pgstat_report_wait_end that
> assume they have no side-effects (for example, on errno) and
> can never fail. I wouldn't trust GetCurrentTimestamp() for either.
> If the report_wait calls can't be dropped into code with *complete*
> certainty that they're safe, that's a big cost.
>
> Why exactly is this insisting on logging timestamps and not,
> say, just incrementing a counter? I think doing it like this
> is almost certain to end in rejection.
>

Because the number of times you hit a wait event may not correlate with
the time you spent waiting on it, so a simple counter is not the most
useful thing.
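
To make that concrete, here is a small sketch with invented numbers
and invented names (WaitEventTotals, busiest_by_count,
busiest_by_time), none of which come from the patch: one event is hit
often but briefly, another rarely but for a long time, and the two
rankings disagree.

```c
#include <stdint.h>

/* Hypothetical per-event totals; the figures below are illustrative only. */
typedef struct WaitEventTotals
{
    const char *name;
    uint64_t    count;      /* how many times the event was hit */
    uint64_t    total_us;   /* total time spent waiting, in microseconds */
} WaitEventTotals;

/* Hit 100000 times at about 1 us each: 0.1 s in total. */
static const WaitEventTotals frequent_short = {"frequent_short", 100000, 100000};

/* Hit 20 times at about 500 ms each: 10 s in total. */
static const WaitEventTotals rare_long = {"rare_long", 20, 10000000};

/* Ranking by counter alone. */
static const WaitEventTotals *
busiest_by_count(const WaitEventTotals *a, const WaitEventTotals *b)
{
    return (a->count >= b->count) ? a : b;
}

/* Ranking by accumulated wait time. */
static const WaitEventTotals *
busiest_by_time(const WaitEventTotals *a, const WaitEventTotals *b)
{
    return (a->total_us >= b->total_us) ? a : b;
}
```

Here busiest_by_count() picks frequent_short while busiest_by_time()
picks rare_long, which is the event that actually dominates the time
spent waiting.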

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
