Re: pg_stat_io for the startup process

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: melanieplageman(at)gmail(dot)com
Cc: m(dot)melihmutlu(at)gmail(dot)com, robertmhaas(at)gmail(dot)com, andres(at)anarazel(dot)de, masao(dot)fujii(at)oss(dot)nttdata(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_stat_io for the startup process
Date: 2023-04-28 02:15:51
Message-ID: 20230428.111551.498122818064927621.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

At Thu, 27 Apr 2023 17:30:40 -0400, Melanie Plageman <melanieplageman(at)gmail(dot)com> wrote in
> On Wed, Apr 26, 2023 at 2:53 PM Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com> wrote:
> >
> > Hi all,
> >
> > Robert Haas <robertmhaas(at)gmail(dot)com>, 26 Nis 2023 Çar, 15:34 tarihinde şunu yazdı:
> >>
> >> On Wed, Apr 26, 2023 at 5:47 AM Kyotaro Horiguchi
> >> <horikyota(dot)ntt(at)gmail(dot)com> wrote:
> >> > 3. When should we call pgstat_report_stats on the startup process?
> >> >
> >> > During recovery, I think we can call pgstat_report_stats() (or a
> >> > subset of it) right before invoking WaitLatch and at segment
> >> > boundaries.
> >>
> >> I think this kind of idea is worth exploring. Andres mentioned timers,
> >> but it seems to me that something where we just do it at certain
> >> "convenient" points might be good enough and simpler. I'd much rather
> >> have statistics that were up to date as of the last time we finished a
> >> segment than have nothing at all.
> >
> >
> > I created a rough prototype of a timer-based approach for comparison. Please see attached.
> >
> > Registered a new timeout named "STARTUP_STAT_FLUSH_TIMEOUT", The timeout's
> > handler sets a flag indicating that io stats need to be flushed.
> > HandleStartupProcInterrupts checks the flag to flush stats. It's enabled if
> > any WAL record is read (in the main loop of PerformWalRecovery). And it's
> > disabled before WaitLatch to avoid unnecessary wakeups if the startup process
> > decides to sleep.
>
> I was reviewing this and found that if I remove the calls to
> disable_startup_stat_flush_timeout() the number of calls to
> pgstat_report_wal() on a briefly active and then idle standby are about
> 8 in 100 seconds whereas with timer disablement, the calls over the same
> period are about 40. I would have thought that disabling the timer would
> have caused us to try and flush the stats less often.
>
> With the calls to disable_startup_stat_flush_timeout(), we do have much
> fewer calls to setitimer(), of course.
>
> Given the below suggestion by Andres, I tried doing some traces of the
> various approaches.
>
> > > + disable_startup_stat_flush_timeout();
> > > +
> > > (void) WaitLatch(&XLogRecoveryCtl->recoveryWakeupLatch,
> > > WL_LATCH_SET | WL_TIMEOUT |
> > > WL_EXIT_ON_PM_DEATH,
> >
> > I think always disabling the timer here isn't quite right - we want to wake up
> > *once* in WaitForWALToBecomeAvailable(), otherwise we'll not submit pending
> > stats before waiting - potentially for a long time - for WAL. One way would be
> > just explicitly report before the WaitLatch().
> >
> > Another approach, I think better, would be to not use enable_timeout_every(),
> > and to explicitly rearm the timer in HandleStartupProcInterrupts(). When
> > called from WaitForWALToBecomeAvailable(), we'd not rearm, and instead do so
> > at the end of WaitForWALToBecomeAvailable(). That way we also wouldn't
> > repeatedly fire between calls to HandleStartupProcInterrupts().
>
> After a quick example implementation of this, I found that it seemed to
> try and flush the stats less often on an idle standby (good) than using
> enable_timeout_every().

Just rearming with the full-interval will work that way. Our existing
strategy for this is seen in PostgresMain().

stats_timeout = pgstat_report_stat(false);
if (stats_timeout > 0)
{
if (!get_timeout_active(BLAH_TIMEOUT))
enable_timeout_after(BLAH_TIMEOUT, stats_timeout);
}
else
{
if (get_timeout_active(BLAH_TIMEOUT))
disable_timeout(BLAH_TIMEOUT, false);
}
WaitLatch();

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2023-04-28 02:43:48 Re: pg_stat_io for the startup process
Previous Message Yu Shi (Fujitsu) 2023-04-28 02:05:39 RE: Fix a test case in 035_standby_logical_decoding.pl