Re: O(n) tasks cause lengthy startups and checkpoints

From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, "Bossart, Nathan" <bossartn(at)amazon(dot)com>, Maxim Orlov <orlovmg(at)gmail(dot)com>, Amul Sul <sulamul(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: O(n) tasks cause lengthy startups and checkpoints
Date: 2023-04-02 19:30:30
Message-ID: 20230402193030.GA25018@nathanxps13
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Apr 02, 2023 at 01:40:05PM -0400, Tom Lane wrote:
> I took a brief look through v20, and generally liked what I saw,
> but there are a few things troubling me:

Thanks for taking a look.

> * The comments for CustodianEnqueueTask claim that it won't enqueue an
> already-queued task, but I don't think I believe that, because it stops
> scanning as soon as it finds an empty slot. That data structure seems
> quite oddly designed in any case. Why isn't it simply an array of
> need-to-run-this-one booleans indexed by the CustodianTask enum?
> Fairness of dispatch could be ensured by the same state variable that
> CustodianGetNextTask already uses to track which array element to
> inspect next. While that wouldn't guarantee that tasks A and B are
> dispatched in the same order they were requested in, I'm not sure why
> we should care.

That works. Will update.

> * I don't much like cust_lck, mainly because you didn't bother to
> document what it protects (in general, CustodianShmemStruct deserves
> more than zero commentary). Do we need it at all? If the task-needed
> flags were sig_atomic_t not bool, we probably don't need it for the
> basic job of tracking which tasks remain to be run. I see that some
> of the tasks have possibly-non-atomically-assigned parameters to be
> transmitted, but restricting cust_lck to protect those seems like a
> better idea.

Will do.

> * Not quite convinced about handle_arg_func, mainly because the Datum
> API would be pretty inconvenient for any task with more than one arg.
> Why do we need that at all, rather than saying that callers should
> set up any required parameters separately before invoking
> RequestCustodian?

I had done it this way earlier, but added the Datum argument based on
feedback upthread [0]. It presently has only one proposed use, anyway, so
I think it would be fine to switch it back for now.

> * Why does LookupCustodianFunctions think it needs to search the
> constant array?

The order of the tasks in the array isn't guaranteed to match the order in
the CustodianTask enum.

> * The original proposal included moving RemovePgTempFiles into this
> mechanism, which I thought was probably the most useful bit of the
> whole thing. I'm sad to see that gone, what became of it?

I postponed that based on advice from upthread [1]. I was hoping to start
a dedicated thread for that immediately after the custodian infrastructure
was committed. FWIW I agree that it's the most useful task of what's
proposed thus far.

[0] https://postgr.es/m/20220703172732.wembjsb55xl63vuw%40awork3.anarazel.de
[1] https://postgr.es/m/CANbhV-EagKLoUH7tLEfg__VcLu37LY78F8gvLMzHrRZyZKm6sw%40mail.gmail.com

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Justin Pryzby 2023-04-02 19:48:52 Re: GUC for temporarily disabling event triggers
Previous Message Daniel Gustafsson 2023-04-02 19:24:33 Re: GUC for temporarily disabling event triggers