Re: when the startup process doesn't (logging startup delays)

From: Nitin Jadhav <nitinjadhavpostgres(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, Stephen Frost <sfrost(at)snowman(dot)net>, Magnus Hagander <magnus(at)hagander(dot)net>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: when the startup process doesn't (logging startup delays)
Date: 2021-08-10 13:28:38
Message-ID: CAMm1aWZGP82buv0st6_aeFoERcOyZCDSRMN-7VxC4-_eFVEoMQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> I'd really like to avoid this. I don't see why it's necessary. You say
> it causes a problem, but you don't explain what problem it causes.
> enable_timeout_at() will disable the timer if not already done. I
> think all we need to do is set scheduled_startup_progress_timeout = 0
> before calling reset_startup_progress_timeout() in the "init" case and
> don't do that for the non-init case. If that's not quite right, maybe
> you can work out something that does work. But adding an is_init flag
> to a function and having no common code between the is_init = true
> case and the is_init = false case is exactly the kind of thing that I
> don't want here. I want as much common code as possible.

Setting set scheduled_startup_progress_timeout = 0 in the "init" case
solves the problem. The problem was if we call init_start_progress()
continuously then the first call to reset_startup_progress_timeout()
sets the value of scheduled_startup_progress_timeout to "now +
interval". Later call to reset_startup_progress_timeout() uses the
previously set value of scheduled_startup_progress_timeout which was
not correct and it was not behaving as expected. I could see that the
first log gets printed far later than the expected interval.

> To some degree, we tend to use names_like_this() for low-level
> functions and NamesLikeThis() for higher-level functions, but that is
> not a very consistent practice.

Ok. Thanks for the information.

> But this strategy cannot be used if the drift is larger than the
> interval. If we are trying to log a message every 1000ms and the timer
> doesn't fire for 14ms, we can wait only 986ms the next time. If it
> doesn't fire for 140ms, we can wait only 860ms the next time. But if
> the timer doesn't fire for 1400ms, we cannot wait for -400ms the next
> time. So what should we do? My proposal is to just wait for the
> configured interval, 1000ms, essentially giving up on drift
> correction. Now you could argue that we ought to just wait for 600ms
> in the hopes of making it 2 * 1000ms after the previous status
> message, but I'm not sure that really has any value, and it doesn't
> seem especially likely to work. The only way timer interrupts are
> likely to be that badly delayed is if the system is horrifically
> overloaded, and if that's the case the next timer interrupt isn't
> likely to fire on schedule anyway. Trying to correct for drift in such
> a situation seems more likely to be confusing than to produce any
> helpful result.

This is what I was trying to convey in case-2. I agree that it is
better to consider "now + interval" in such a case instead of trying
to adjust the drift.

Please find the updated patch attached.

On Tue, Aug 10, 2021 at 1:06 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Mon, Aug 9, 2021 at 11:20 AM Nitin Jadhav
> <nitinjadhavpostgres(at)gmail(dot)com> wrote:
> > Modified the reset_startup_progress_timeout() to take the second
> > parameter which indicates whether it is called for initialization or
> > for resetting. Without this parameter there is a problem if we call
> > init_startup_progress() more than one time if there is no call to
> > ereport_startup_progress() in between as the code related to disabling
> > the timer has been removed.
>
> I'd really like to avoid this. I don't see why it's necessary. You say
> it causes a problem, but you don't explain what problem it causes.
> enable_timeout_at() will disable the timer if not already done. I
> think all we need to do is set scheduled_startup_progress_timeout = 0
> before calling reset_startup_progress_timeout() in the "init" case and
> don't do that for the non-init case. If that's not quite right, maybe
> you can work out something that does work. But adding an is_init flag
> to a function and having no common code between the is_init = true
> case and the is_init = false case is exactly the kind of thing that I
> don't want here. I want as much common code as possible.
>
> > > This makes sense, but I think I'd like to have all the functions in
> > > this patch use names_like_this() rather than NamesLikeThis().
> >
> > I have changed all the function names accordingly. But I would like to
> > know why it should be names_like_this() as there are many functions
> > with the format NamesLikeThis(). I would like to understand when to
> > use what, just to incorporate in the future patches.
>
> There is, unfortunately, no hard-and-fast rule, but we want to
> maintain as much consistency with the existing style as we can. I
> wasn't initially sure what would work best for this particular patch,
> but after we committed to a name like ereport_startup_progress() that
> to me was a strong hint in favor of using names_like_this()
> throughout. It seems impossible to imagine punctuating it as
> EreportStartupProgress() or something since that would be wildly
> inconsistent with the existing function name, and there seems to be no
> good reason why this patch can't be internally consistent.
>
> To some degree, we tend to use names_like_this() for low-level
> functions and NamesLikeThis() for higher-level functions, but that is
> not a very consistent practice.
>
> > > reset_startup_progress_timeout(TimeStampTz now)
> > > {
> > > next_timeout = last_startup_progress_timeout + interval;
> > > if (next_timeout < now)
> > > {
> > > // Either the timeout was processed so late that we missed an entire cycle,
> > > // or the system clock was set backwards.
> > > next_timeout = now + interval;
> > > }
> > > enable_timeout_at(next_timeout);
> > > last_startup_progress_timeout = next_timeout;
> > > }
> >
> > As per the above logic, I would like to discuss 2 cases.
> >
> > Case-1:
> > First scheduled timeout is after 1 sec. But the time out occurred
> > after 1.5 sec. So the log msg prints after 1.5 sec. Next timer is
> > scheduled after 2 sec (scheduled_startup_progress_timeout + interval).
> > The condition (next_timeout < now) gets evaluated to false and
> > everything goes smooth.
> >
> > Case-2:
> > First scheduled timeout is after 1 sec. But the timeout occurred after
> > 2.5 sec. So the log msg prints after 2.5 sec. Now next time is
> > scheduled after 2 sec (scheduled_startup_progress_timeout + interval).
> > So the condition (next_timeout < now) will fail and the next_timeout
> > will get reset to 3.5 sec (2.5 + 1) and it continues. Is this
> > behaviour ok or should we set the next_timeout to 3 sec. Please share
> > your thoughts on this.
>
> I can't quite follow this, because it seems like you are sometimes
> viewing the interval as 1 second and sometimes as 2 seconds. Maybe you
> could clarify that, and perhaps show example output?
>
> My feeling is that the timer will almost always be slightly late, but
> it will very rarely be extremely late, and it will also very rarely be
> early (only if someone resets the system clock). So let's consider
> those two cases separately. If the timer is a little bit late each
> time, we want to avoid drift, so we want to shorten the next sleep
> time by the amount that the previous interrupt was late. If the
> interval is 1000ms and the interrupt fires 1ms late then we should
> sleep 999ms the next time; if 2ms late, 998ms. That way, although
> there will be some variation in which the messages are logged, the
> drift won't accumulate over time and even after many minutes of
> recovery the messages will be printed at ABOUT the same number of
> milliseconds after the second every time, instead of drifting further
> and further off course.
>
> But this strategy cannot be used if the drift is larger than the
> interval. If we are trying to log a message every 1000ms and the timer
> doesn't fire for 14ms, we can wait only 986ms the next time. If it
> doesn't fire for 140ms, we can wait only 860ms the next time. But if
> the timer doesn't fire for 1400ms, we cannot wait for -400ms the next
> time. So what should we do? My proposal is to just wait for the
> configured interval, 1000ms, essentially giving up on drift
> correction. Now you could argue that we ought to just wait for 600ms
> in the hopes of making it 2 * 1000ms after the previous status
> message, but I'm not sure that really has any value, and it doesn't
> seem especially likely to work. The only way timer interrupts are
> likely to be that badly delayed is if the system is horrifically
> overloaded, and if that's the case the next timer interrupt isn't
> likely to fire on schedule anyway. Trying to correct for drift in such
> a situation seems more likely to be confusing than to produce any
> helpful result.
>
> --
> Robert Haas
> EDB: http://www.enterprisedb.com

Attachment Content-Type Size
v11-0001-startup-process-progress.patch application/octet-stream 12.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2021-08-10 13:35:00 Re: replay of CREATE TABLESPACE eats data at wal_level=minimal
Previous Message Alvaro Herrera 2021-08-10 13:27:33 Re: Autovacuum on partitioned table (autoanalyze)