Re: RFC: replace pg_stat_activity.waiting with something more descriptive

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Ildus Kurbangaliev <i(dot)kurbangaliev(at)postgrespro(dot)ru>
Cc: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: RFC: replace pg_stat_activity.waiting with something more descriptive
Date: 2015-07-13 10:36:39
Message-ID: CAA4eK1Je=qc0D=kWiLwyYXF4s=vjXuZYFMV_DLXCA5BaKy--9A@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jul 13, 2015 at 3:26 PM, Ildus Kurbangaliev <i(dot)kurbangaliev(at)postgrespro(dot)ru> wrote:

>
> On 07/12/2015 06:53 AM, Amit Kapila wrote:
>
> To report a duration, I think you need to call gettimeofday or some
> similar function to calculate the wait time. That will be okay when
> the wait is long, but it could be problematic when the waits are
> very short (which could well be the case for LWLocks).
>
> gettimeofday is already used in our patch and it gives enough accuracy
> (microseconds), especially when LWLocks become a problem. We also tested
> our implementation and the overhead is less than 1% (see the testing part of
> http://www.postgresql.org/message-id/559D4729.9080704@postgrespro.ru).
>

I think that test is quite generic. We should test more combinations
(for example, the -M prepared option, as that can stress the LWLock
machinery somewhat more) and other types of tests that stress the
parts of the code where the patch calls gettimeofday().
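
To make the concern about very short waits concrete, here is a minimal
standalone sketch (not code from the patch) that estimates what a single
gettimeofday() call costs on a given machine. The function names
elapsed_usec and gettimeofday_cost_ns are hypothetical and exist only for
this illustration; the loop is timed from outside with
clock_gettime(CLOCK_MONOTONIC).

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* ask for clock_gettime() and gettimeofday() */
#endif

#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>
#include <time.h>

/* Microseconds elapsed between two gettimeofday() readings. */
int64_t
elapsed_usec(const struct timeval *start, const struct timeval *end)
{
    return (int64_t) (end->tv_sec - start->tv_sec) * 1000000
        + (int64_t) (end->tv_usec - start->tv_usec);
}

/*
 * Average cost, in nanoseconds, of one gettimeofday() call.
 * The whole loop is bracketed by two monotonic clock readings and the
 * total is divided by the number of calls. Hypothetical helper, for
 * illustration only.
 */
double
gettimeofday_cost_ns(long calls)
{
    struct timespec t0;
    struct timespec t1;
    struct timeval  tv;
    long            i;

    if (calls <= 0)
        return 0.0;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (i = 0; i < calls; i++)
        gettimeofday(&tv, NULL);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    return ((double) (t1.tv_sec - t0.tv_sec) * 1e9
            + (double) (t1.tv_nsec - t0.tv_nsec)) / (double) calls;
}
```

Calling gettimeofday_cost_ns(1000000) from a small main() gives a rough
per-call figure. If that figure is close to the length of a typical
LWLock wait on the platform, two calls per wait would make up a large
share of what is being measured, which is why results can differ a lot
between platforms and clock sources.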

> We need help here with testing on other platforms. I used gettimeofday
> because the built-in header "instr_time.h" already provides measurement
> functions that are tested across platforms, but I'm planning to make a
> similar implementation for monotonic functions based on clock_gettime for
> better accuracy.
>
>
> > 2) Accumulate per-backend statistics about each wait event type: number
> > of occurrences and total duration. With these statistics the user can
> > identify system bottlenecks, again without sampling.
> >
> > Number #2 will be provided as a separate patch.
> > Number #1 requires a different concurrency model. Ildus will extract it
> > from the "waits monitoring" patch shortly.
> >
>
> Sure, I think those should be evaluated as separate patches.
> I can look into those patches and see if something more
> can be exposed as part of this patch that can be reused in
> those patches.
>
> If you agree, I'll make some modifications to your patch so that we can
> later extend it with our other modifications. The main issue is that one
> variable for all types is not enough. For flexibility in the future we need
> at least two: class and event, for example class=LWLock, event=ProcArrayLock,
> or class=Storage, event=READ.
>

I have already proposed something very similar in this thread [1]
(where I used wait_event_type instead of class), and Robert doesn't
agree with it. So before writing code, it seems prudent to reach
agreement on what kind of user interface would satisfy the
requirement and also be extensible in the future. It would be better
if you could highlight what kind of user interface is better (more
extensible) and the reasons for it.
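
Only to make the two-field idea concrete for that discussion, here is one
possible shape it could take. This is a sketch under assumptions, not
code from either patch: the enum, the function names, the event numbers
and the 8-bit/24-bit split are all hypothetical. It packs the class (or
wait_event_type) and the event into one 32-bit word, so a backend could
publish both with a single store, while the user-visible side still
shows them as two separate values.

```c
#include <stdint.h>

/* Hypothetical wait classes; LWLock and Storage are the examples above. */
typedef enum WaitClass
{
    WAIT_CLASS_NONE = 0,
    WAIT_CLASS_LWLOCK = 1,
    WAIT_CLASS_STORAGE = 2
} WaitClass;

#define WAIT_EVENT_MASK 0x00FFFFFFu     /* low 24 bits hold the event id */

/* Pack class (high 8 bits) and event id (low 24 bits) into one word. */
uint32_t
wait_info_pack(WaitClass wclass, uint32_t event)
{
    return ((uint32_t) wclass << 24) | (event & WAIT_EVENT_MASK);
}

/* Extract the class from a packed word. */
WaitClass
wait_info_class(uint32_t info)
{
    return (WaitClass) (info >> 24);
}

/* Extract the event id from a packed word. */
uint32_t
wait_info_event(uint32_t info)
{
    return info & WAIT_EVENT_MASK;
}

/* Name that a monitoring view could show for the class column. */
const char *
wait_class_name(WaitClass wclass)
{
    switch (wclass)
    {
        case WAIT_CLASS_NONE:
            return "None";
        case WAIT_CLASS_LWLOCK:
            return "LWLock";
        case WAIT_CLASS_STORAGE:
            return "Storage";
    }
    return "Unknown";
}
```

With this layout, class=LWLock, event=ProcArrayLock would be stored as
wait_info_pack(WAIT_CLASS_LWLOCK, n) for whatever number n is assigned to
ProcArrayLock. Whether the user-visible side should be one column or two
is exactly the interface question that needs agreement first.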

[1] (Refer option-3) -
http://www.postgresql.org/message-id/CAA4eK1J6Cg_jYM00nrwt4n8r78Zn4LJoqY_zU1xRzXFq+mEY3g@mail.gmail.com

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
