Re: shared-memory based stats collector

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: masao(dot)fujii(at)oss(dot)nttdata(dot)com
Cc: gkokolatos(at)protonmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: shared-memory based stats collector
Date: 2021-03-09 07:51:10
Message-ID: 20210309.165110.1259016108533244855.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

At Sat, 6 Mar 2021 00:32:07 +0900, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote in
>
>
> On 2021/03/05 17:18, Kyotaro Horiguchi wrote:
> > At Thu, 21 Jan 2021 12:03:48 +0900 (JST), Kyotaro Horiguchi
> > <horikyota(dot)ntt(at)gmail(dot)com> wrote in
> >> Commit 960869da08 (database statistics) conflicted with this. Rebased.
> >>
> >> I'm concerned about the behavior that pgstat_update_connstats calls
> >> GetCurrentTimestamp() every time stats update happens (with intervals
> >> of 10s-60s in this patch). But I didn't change that design since that
> >> happens with about 0.5s intervals in master and the rate is largely
> >> reduced in this patch, to make this patch simpler.
> > I stepped on my foot, and another commit coflicted. Just rebased.
>
> Thanks for rebasing the patches!
>
> I think that 0003 patch is self-contained and useful, for example
> which
> enables us to monitor archiver process in pg_stat_activity. So IMO
> it's worth pusing 0003 patch firstly.

I'm not sure archiver process is worth realtime monitoring, but I
agree that the patch makes it possible. Anyway it is requried in this
patchset and I'm happy to see it committed beforehand.

Thanks for the review.

> Here are the review comments for 0003 patch.
>
> + /* Archiver process's latch */
> + Latch *archiverLatch;
> + /* Current shared estimate of appropriate spins_per_delay value */
>
> The last line in the above seems not necessary.

Oops. It seems like a garbage after a past rebasing. Removed.

> In proc.h, NUM_AUXILIARY_PROCS needs to be incremented.

Right. Increased to 5 and rewrote the comment.

> /* ----------
> * Functions called from postmaster
> * ----------
> */
> extern int pgarch_start(void);
>
> In pgarch.h, the above is not necessary.

Removed.

> +extern void XLogArchiveWakeup(void);
>
> This seems no longer necessary.
>
> +extern void XLogArchiveWakeupStart(void);
> +extern void XLogArchiveWakeupEnd(void);
> +extern void XLogArchiveWakeup(void);
>
> These seem also no longer necessary.

Sorry for many garbages. Removed all of them.

> PgArchPID = 0;
> if (!EXIT_STATUS_0(exitstatus))
> - LogChildExit(LOG, _("archiver process"),
> - pid, exitstatus);
> - if (PgArchStartupAllowed())
> - PgArchPID = pgarch_start();
> + HandleChildCrash(pid, exitstatus,
> + _("archiver process"));
>
> I don't think that we should treat non-zero exit condition as a crash,
> as before. Otherwise when archive_command fails on a signal,
> archiver emits FATAL error and which leads the server restart.

Sounds reasonable. Now archiver is treated the same way to wal
receiver. Specifically exit(1) doesn't cause server restart.

> - * walwriter, autovacuum, or background worker.
> + * walwriter, autovacuum, archiver or background worker.
> *
> * The objectives here are to clean up our local state about the child
> * process, and to signal all other remaining children to quickdie.
> @@ -3609,6 +3606,18 @@ HandleChildCrash(int pid, int exitstatus, const
> char *procname)
> signal_child(AutoVacPID, (SendStop ? SIGSTOP : SIGQUIT));
> }
> + /* Take care of the archiver too */
> + if (pid == PgArchPID)
> + PgArchPID = 0;
> + else if (PgArchPID != 0 && take_action)
> + {
> + ereport(DEBUG2,
> + (errmsg_internal("sending %s to process %d",
> + (SendStop ? "SIGSTOP" : "SIGQUIT"),
> + (int) PgArchPID)));
> + signal_child(PgArchPID, (SendStop ? SIGSTOP : SIGQUIT));
> + }
> +
>
> Same as above.

Mmm. In the first place, I found that I forgot to remove existing
code to handle archiver... Removed it instead of the above, which is
added by this patch. Since the process becomes an auxiliary process,
no reason to differentiate from other auxiliary processes in handling?

> In xlogarchive.c, "#include "storage/pmsignal.h"" is no longer
> necessary.

Removed.

> pgarch_forkexec() should be removed from pgarch.c because it's no
> longer used.

Right. Removed. EXEC_BACKEND still fails for another reason of
prototype mismatch of PgArchiverMain. Fixed it toghether.

> /* ------------------------------------------------------------
> * Public functions called from postmaster follow
> * ------------------------------------------------------------
> */
>
> The definition of PgArchiverMain() should be placed just
> after the above comment.

The module no longer have a function called from postmaster. Now
PgArchiverMain() is placed just below "/* Main entry point...".

> exit(0) in PgArchiverMain() should be proc_exit(0)?

Yeah, proc_exit() says as it should be the only function to call
exit() directly.

By the way, the patch (0003) removes the flag "wakened". The flag was
originally added to prevent spurious wakeups (66ec2db7284). At the
time the pg_ysleep in pgarch_MainLoop was replaced with WaitLatch by
89fd72cbf26, the flag survived but with losing its effect since
WaitLatch doesn't get spurious wakeups (AFAICS). So if the change
(removal of "wakened") is correct, it might worth another patch.

I'll send new patchset soon.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2021-03-09 07:53:11 Re: shared-memory based stats collector
Previous Message Justin Pryzby 2021-03-09 07:43:13 Re: [HACKERS] Custom compression methods