Re: O(n) tasks cause lengthy startups and checkpoints

From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Simon Riggs <simon(dot)riggs(at)enterprisedb(dot)com>, "Bossart, Nathan" <bossartn(at)amazon(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Maxim Orlov <orlovmg(at)gmail(dot)com>, Amul Sul <sulamul(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: O(n) tasks cause lengthy startups and checkpoints
Date: 2022-11-28 23:40:39
Message-ID: 20221128234039.GA1119654@nathanxps13
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Okay, here is a new patch set. 0004 adds logic to prevent custodian tasks
from delaying shutdown.

I haven't added any logging for long-running tasks yet. Tasks might
ordinarily take a while, so such logs wouldn't necessarily indicate
something is wrong. Perhaps we could add a GUC for the amount of time to
wait before logging. This feature would be off by default. Another option
could be to create a log_custodian GUC that causes tasks to be logged when
completed, similar to log_checkpoints. Thoughts?

On Mon, Nov 28, 2022 at 01:37:01PM -0500, Robert Haas wrote:
> On Mon, Nov 28, 2022 at 1:31 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
>> On 2022-11-28 13:08:57 +0000, Simon Riggs wrote:
>> > On Sun, 27 Nov 2022 at 23:34, Nathan Bossart <nathandbossart(at)gmail(dot)com> wrote:
>> > > > Rather than explicitly use DEBUG1 everywhere I would have an
>> > > > #define CUSTODIAN_LOG_LEVEL LOG
>> > > > so we can run with it in LOG mode and then set it to DEBUG1 with a one
>> > > > line change in a later phase of Beta
>> > >
>> > > I can create a separate patch for this, but I don't think I've ever seen
>> > > this sort of thing before.
>> >
>> > Much of recovery is coded that way, for the same reason.
>>
>> I think that's not a good thing to copy without a lot more justification than
>> "some old code also does it that way". It's sometimes justified, but also
>> makes code harder to read (one doesn't know what it does without looking up
>> the #define, line length).
>
> Yeah. If people need some of the log messages at a higher level during
> development, they can patch their own copies.
>
> I think there might be some argument for having a facility that lets
> you pick subsystems or even individual messages that you want to trace
> and pump up the log level for just those call sites. But I don't know
> exactly what that would look like, and I don't think inventing one-off
> mechanisms for particular cases is a good idea.

Given this discussion, I haven't made any changes to the logging in the new
patch set.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment Content-Type Size
v15-0001-Introduce-custodian.patch text/x-diff 24.3 KB
v15-0002-Move-removal-of-old-serialized-snapshots-to-cust.patch text/x-diff 4.5 KB
v15-0003-Move-removal-of-old-logical-rewrite-mapping-file.patch text/x-diff 9.2 KB
v15-0004-Do-not-delay-shutdown-due-to-long-running-custod.patch text/x-diff 2.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2022-11-28 23:42:48 Re: fixing CREATEROLE
Previous Message David G. Johnston 2022-11-28 23:31:49 Re: fixing CREATEROLE