Re: O(n) tasks cause lengthy startups and checkpoints

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, "Bossart, Nathan" <bossartn(at)amazon(dot)com>, Maxim Orlov <orlovmg(at)gmail(dot)com>, Amul Sul <sulamul(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: O(n) tasks cause lengthy startups and checkpoints
Date: 2023-04-02 17:40:05
Message-ID: 1031491.1680457205@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Nathan Bossart <nathandbossart(at)gmail(dot)com> writes:
> another rebase for cfbot

I took a brief look through v20, and generally liked what I saw,
but there are a few things troubling me:

* The comments for CustodianEnqueueTask claim that it won't enqueue an
already-queued task, but I don't think I believe that, because it stops
scanning as soon as it finds an empty slot. That data structure seems
quite oddly designed in any case. Why isn't it simply an array of
need-to-run-this-one booleans indexed by the CustodianTask enum?
Fairness of dispatch could be ensured by the same state variable that
CustodianGetNextTask already uses to track which array element to
inspect next. While that wouldn't guarantee that tasks A and B are
dispatched in the same order they were requested in, I'm not sure why
we should care.

* I don't much like cust_lck, mainly because you didn't bother to
document what it protects (in general, CustodianShmemStruct deserves
more than zero commentary). Do we need it at all? If the task-needed
flags were sig_atomic_t not bool, we probably don't need it for the
basic job of tracking which tasks remain to be run. I see that some
of the tasks have possibly-non-atomically-assigned parameters to be
transmitted, but restricting cust_lck to protect those seems like a
better idea.

* Not quite convinced about handle_arg_func, mainly because the Datum
API would be pretty inconvenient for any task with more than one arg.
Why do we need that at all, rather than saying that callers should
set up any required parameters separately before invoking
RequestCustodian?

* Why does LookupCustodianFunctions think it needs to search the
constant array?

* The original proposal included moving RemovePgTempFiles into this
mechanism, which I thought was probably the most useful bit of the
whole thing. I'm sad to see that gone, what became of it?

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2023-04-02 17:46:57 Re: logical decoding and replication of sequences, take 2
Previous Message Daniel Gustafsson 2023-04-02 17:37:55 Re: Add "host" to startup packet