Re: O(n) tasks cause lengthy startups and checkpoints

From: "Bossart, Nathan" <bossartn(at)amazon(dot)com>
To: Maxim Orlov <orlovmg(at)gmail(dot)com>
Cc: Amul Sul <sulamul(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, "Bruce Momjian" <bruce(at)momjian(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, "Bharath Rupireddy" <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: O(n) tasks cause lengthy startups and checkpoints
Date: 2022-01-14 19:16:01
Message-ID: D71FD587-F027-442B-929D-7059184CA833@amazon.com
Lists: pgsql-hackers

On 1/14/22, 3:43 AM, "Maxim Orlov" <orlovmg(at)gmail(dot)com> wrote:
> The code seems to be in good condition. All the tests are running ok with no errors.

Thanks for your review.

> I like the whole idea of shifting additional checkpointer jobs as much as possible to another worker. In my view, it is more appropriate to call this worker "bg cleaner" or "bg file cleaner" or something similar.
>
> It could be useful for systems with high load, which may deal with deleting many files at once, but I'm not sure about "small" installations. An extra bg worker needs more resources to do occasional deletion of small numbers of files. I really do not know how to do it better; maybe have two different code paths switched by a GUC?

I'd personally like to avoid creating two code paths for the same
thing. Are there really cases when this one extra auxiliary process
would be too many? And if so, how would a user know when to adjust
this GUC? I understand the point that we should introduce new
processes sparingly to avoid burdening low-end systems, but I don't
think we should be afraid to add new ones when they are needed.

That being said, if making the extra worker optional addresses the
concerns about resource usage, maybe we should consider it. Justin
suggested using something like max_parallel_maintenance_workers
upthread [0].

> Should we also think about adding WAL preallocation to the custodian worker from the patch "Pre-allocating WAL files" [1]?

This was brought up in the pre-allocation thread [1]. I don't think
the custodian process would be the right place for it, and I'm also
not as concerned about it because it will generally be a small, fixed,
and configurable amount of work. In any case, I don't sense a ton of
support for a new auxiliary process in this thread, so I'm hesitant to
go down the same path for pre-allocation.

Nathan

[0] https://postgr.es/m/20211213171935.GX17618%40telsasoft.com
[1] https://postgr.es/m/B2ACCC5A-F9F2-41D9-AC3B-251362A0A254%40amazon.com
