Re: [Proposal] Add accumulated statistics for wait event

From: Atsushi Torikoshi <atorik(at)gmail(dot)com>
To: "imai(dot)yoshikazu(at)fujitsu(dot)com" <imai(dot)yoshikazu(at)fujitsu(dot)com>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "yoshikazu_i443(at)live(dot)jp" <yoshikazu_i443(at)live(dot)jp>
Subject: Re: [Proposal] Add accumulated statistics for wait event
Date: 2020-04-24 02:56:01
Message-ID: CACZ0uYHi0bjg6+EFhA=BM+c14rbu4tj5v+EyVJ=GY1kRjqqASw@mail.gmail.com
Lists: pgsql-hackers

Hi Imai-san,

I feel your 'pg_stat_waitaccum' will help us investigate bottlenecks.
I'd like to do some benchmarks, but unfortunately the latest v6 patch
no longer applies to HEAD.

Is it possible to share the latest patches?
If not, I'll rebase v6 so that it applies to HEAD.

Regards,

--
Atsushi Torikoshi

On Fri, Feb 28, 2020 at 5:17 PM imai(dot)yoshikazu(at)fujitsu(dot)com <
imai(dot)yoshikazu(at)fujitsu(dot)com> wrote:

> On Wed, Feb 26, 2020 at 1:39 AM, Kyotaro Horiguchi wrote:
> > Hello. I had a brief look at this and have some comments.
>
> Hi, Horiguchi-san. Thank you for looking at this!
>
> > It uses its own hash implementation. Aside from the appropriateness of
> > having another implementation of an existing tool, in the first place a
> > hash works well for a wide, sparse, and uncertain set of keys. But these
> > keys form rather a dense and narrow set with fixed, known members. It
> > registers more than 200 entries, but the bucket size is 461. That may be
> > needed to avoid collisions, but it seems a bit too wasteful.
>
> Yes, wait events are grouped, and wait event IDs are defined as sequential
> numbers starting from a specified base number for each group (like
> 0x01000000U), so the keys form a dense and narrow set.
>
> =====
> #define PG_WAIT_LWLOCK 0x01000000U
> #define PG_WAIT_LOCK 0x03000000U
> #define PG_WAIT_BUFFER_PIN 0x04000000U
> #define PG_WAIT_ACTIVITY 0x05000000U
> ...
>
> typedef enum
> {
>     WAIT_EVENT_ARCHIVER_MAIN = PG_WAIT_ACTIVITY,
>     WAIT_EVENT_AUTOVACUUM_MAIN,
>     ...
>     WAIT_EVENT_WAL_WRITER_MAIN
> } WaitEventActivity;
> =====
>
> The number 461 is the lowest number that avoids collisions in the hash for
> the current set of wait events. As you pointed out, there are a bit too
> many unused entries.
>
>
> If we prepare an array for each wait class, sized to just fit the number
> of wait events in that class, we can store wait event statistics in those
> arrays with no wasted entries.
> Also, in the hash case we compute the bucket number as "(wait event ID) %
> (bucket size)", while in the array case we compute the index as
> "(wait event ID) - (wait event class ID)". The latter is cheaper than the
> former.
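>
> To make the comparison concrete, something like this (an illustrative
> sketch only; WA_BUCKET_SIZE and the variable and helper names are
> assumptions, not the patch's actual code):
>
> =====
> /* Hash case: compute a bucket by modulo, then resolve collisions
>  * by searching within the bucket (extra work on every lookup). */
> slot = waitEventId % WA_BUCKET_SIZE;      /* WA_BUCKET_SIZE = 461 */
> entry = lookup_in_bucket(slot, waitEventId);
>
> /* Array case: the dense per-class offset is itself the index. */
> idx = waitEventId - PG_WAIT_ACTIVITY;     /* no search, no collisions */
> activity_stats[idx].calls++;
> =====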
>
> Currently I store wait event statistics in a hash because it was easy to
> implement, but for the above reasons I think it is better to use arrays.
> One concern is that this puts a restriction on wait events: each one must
> be defined sequentially within its wait class so that there are no unused
> entries in the array.
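>
> If we go the array way, that restriction could be checked at compile time,
> roughly like below (a sketch; NUM_WAIT_EVENT_ACTIVITY is a hypothetical
> count of the WaitEventActivity members, and StaticAssertDecl is assumed
> available from c.h):
>
> =====
> /* Sketch: fail the build if WaitEventActivity has gaps, i.e. if its
>  * last member does not sit exactly (count - 1) above the class base. */
> StaticAssertDecl(WAIT_EVENT_WAL_WRITER_MAIN - PG_WAIT_ACTIVITY + 1 ==
>                  NUM_WAIT_EVENT_ACTIVITY,
>                  "WaitEventActivity events must be defined sequentially");
> =====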
>
>
> > It seems to try to avoid needless work when the feature is not
> > activated by checking "if (wa_hash == NULL)", but the hash is created
> > and filled with all possible entries regardless of whether
> > track_wait_timing is on or off.
>
> This might be a bad implementation, but there are cases where
> "wa_hash == NULL" because pgstat_init() is not called, such as when
> executing "initdb". I inserted that check to avoid unexpected crashes in
> those cases.
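>
> Roughly, the guard looks like this (a sketch; the function signature and
> the WAEntry type are assumptions based on the function names in the
> patch):
>
> =====
> /* Sketch: skip accumulation entirely when the stats hash was never
>  * built, i.e. pgstat_init() was not called (e.g. under initdb). */
> static void
> pgstat_report_waitaccum_end(uint32 wait_event_info)
> {
>     WAEntry    *entry;
>
>     if (wa_hash == NULL)        /* not initialized; nothing to do */
>         return;
>
>     entry = pgstat_get_wa_entry(wa_hash, wait_event_info);
>     entry->calls++;
> }
> =====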
>
> I also noticed some leftover debug code around there... I will clean it up.
>
> > As a result, pgstat_report_waitaccum_end calls pgstat_get_wa_entry
> > tremendously frequently. This should be the reason for the recent
> > benchmark result. I'm not sure such frequent hash searching is
> > acceptable even when the feature is turned on.
>
> The track_wait_timing parameter determines whether we collect wait time.
> Regardless of that parameter, we always collect the wait count every time
> a wait event happens. I don't think calling pgstat_get_wa_entry frequently
> is critical to performance: in Pavel's benchmark results with
> track_wait_timing off, the performance difference between patched and
> unpatched is within +/-1.0%, which can be regarded as measurement noise.
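>
> For reference, the accumulation path is intended to behave like this
> (a rough sketch; the entry layout and the GUC's C variable name are
> assumptions):
>
> =====
> /* Sketch: the call count is always bumped; elapsed time is accumulated
>  * only when track_wait_timing is on. */
> entry->calls++;
> if (pgstat_track_wait_timing)
> {
>     instr_time  end;
>
>     INSTR_TIME_SET_CURRENT(end);
>     INSTR_TIME_ACCUM_DIFF(entry->elapsed, end, entry->start_time);
> }
> =====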
>
>
> Thanks
> --
> Yoshikazu Imai
>
>
>
