Re: pg_stat_io for the startup process

From: Andres Freund <andres(at)anarazel(dot)de>
To: Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, masao(dot)fujii(at)oss(dot)nttdata(dot)com, pgsql-hackers(at)postgresql(dot)org, melanieplageman(at)gmail(dot)com
Subject: Re: pg_stat_io for the startup process
Date: 2023-04-27 16:27:43
Message-ID: 20230427162743.5rpv4fbj3m2tqq74@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2023-04-26 21:53:16 +0300, Melih Mutlu wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com>, 26 Nis 2023 Çar, 15:34 tarihinde şunu
> yazdı:
>
> > On Wed, Apr 26, 2023 at 5:47 AM Kyotaro Horiguchi
> > <horikyota(dot)ntt(at)gmail(dot)com> wrote:
> > > 3. When should we call pgstat_report_stats on the startup process?
> > >
> > > During recovery, I think we can call pgstat_report_stats() (or a
> > > subset of it) right before invoking WaitLatch and at segment
> > > boundaries.
> >
> > I think this kind of idea is worth exploring. Andres mentioned timers,
> > but it seems to me that something where we just do it at certain
> > "convenient" points might be good enough and simpler. I'd much rather
> > have statistics that were up to date as of the last time we finished a
> > segment than have nothing at all.
> >
>
> I created a rough prototype of a timer-based approach for comparison.
> Please see attached.

Thanks!

> 2- I'm also not sure if this timeout should be registered at the beginning
> of StartupProcessMain, or does it even make any difference. I tried to do
> this just before the main redo loop in PerformWalRecovery, but that made
> the CI red.

Huh - do you have a link to the failure? That's how I would expect it to be
done.

> /* Unsupported old recovery command file names (relative to $PGDATA) */
> #define RECOVERY_COMMAND_FILE "recovery.conf"
> @@ -1675,6 +1676,9 @@ PerformWalRecovery(void)
> ereport_startup_progress("redo in progress, elapsed time: %ld.%02d s, current LSN: %X/%X",
> LSN_FORMAT_ARGS(xlogreader->ReadRecPtr));
>
> + /* Is this the right place to enable this? */
> + enable_startup_stat_flush_timeout();
> +

I think we should try not have additional code for this inside the loop - we
should enable the timer once, when needed, not repeatedly.

> #ifdef WAL_DEBUG
> if (XLOG_DEBUG ||
> (record->xl_rmid == RM_XACT_ID && trace_recovery_messages <= DEBUG2) ||
> @@ -3617,6 +3621,13 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
> /* Do background tasks that might benefit us later. */
> KnownAssignedTransactionIdsIdleMaintenance();
>
> + /*
> + * Need to disable flush timeout to avoid unnecessary
> + * wakeups. Enable it again after a WAL record is read
> + * in PerformWalRecovery.
> + */
> + disable_startup_stat_flush_timeout();
> +
> (void) WaitLatch(&XLogRecoveryCtl->recoveryWakeupLatch,
> WL_LATCH_SET | WL_TIMEOUT |
> WL_EXIT_ON_PM_DEATH,

I think always disabling the timer here isn't quite right - we want to wake up
*once* in WaitForWALToBecomeAvailable(), otherwise we'll not submit pending
stats before waiting - potentially for a long time - for WAL. One way would be
just explicitly report before the WaitLatch().

Another approach, I think better, would be to not use enable_timeout_every(),
and to explicitly rearm the timer in HandleStartupProcInterrupts(). When
called from WaitForWALToBecomeAvailable(), we'd not rearm, and instead do so
at the end of WaitForWALToBecomeAvailable(). That way we also wouldn't
repeatedly fire between calls to HandleStartupProcInterrupts().

> diff --git a/src/backend/postmaster/startup.c b/src/backend/postmaster/startup.c
> index efc2580536..b250fa95f9 100644
> --- a/src/backend/postmaster/startup.c
> +++ b/src/backend/postmaster/startup.c
> @@ -72,6 +72,11 @@ static TimestampTz startup_progress_phase_start_time;
> */
> static volatile sig_atomic_t startup_progress_timer_expired = false;
>
> +/* Indicates whether flushing stats is needed. */
> +static volatile sig_atomic_t startup_stat_need_flush = false;
> +
> +int pgstat_stat_flush_timeout = 1000; /* 1 sec ?? */

We probably should move the existing PGSTAT_MIN_INTERVAL constant from
pgstat.c to pgstat.h.

> +extern void enable_startup_stat_flush_timeout(void);
> +extern void disable_startup_stat_flush_timeout(void);
> +extern void startup_stat_flush_timeout_handler(void);
> +
> #endif /* _STARTUP_H */
> diff --git a/src/include/utils/timeout.h b/src/include/utils/timeout.h
> index e561a1cde9..a8d360e255 100644
> --- a/src/include/utils/timeout.h
> +++ b/src/include/utils/timeout.h
> @@ -35,6 +35,7 @@ typedef enum TimeoutId
> IDLE_STATS_UPDATE_TIMEOUT,
> CLIENT_CONNECTION_CHECK_TIMEOUT,
> STARTUP_PROGRESS_TIMEOUT,
> + STARTUP_STAT_FLUSH_TIMEOUT,
> /* First user-definable timeout reason */
> USER_TIMEOUT,
> /* Maximum number of timeout reasons */

I think we could just reuse IDLE_STATS_UPDATE_TIMEOUT?

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Steele 2023-04-27 16:35:25 Re: Possible regression setting GUCs on \connect
Previous Message Tom Lane 2023-04-27 16:13:21 Re: Possible regression setting GUCs on \connect