Re: O(n) tasks cause lengthy startups and checkpoints

From: Simon Riggs <simon(dot)riggs(at)enterprisedb(dot)com>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, "Bossart, Nathan" <bossartn(at)amazon(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Maxim Orlov <orlovmg(at)gmail(dot)com>, Amul Sul <sulamul(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: O(n) tasks cause lengthy startups and checkpoints
Date: 2022-06-23 11:58:15
Message-ID: CANbhV-H9=DOBdpKE77XB_hiPDKVre-+1vuLvKWTke07Y3K1P5w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, 18 Feb 2022 at 20:51, Nathan Bossart <nathandbossart(at)gmail(dot)com> wrote:
>
> > On Thu, Feb 17, 2022 at 03:12:47PM -0800, Andres Freund wrote:
> >>> > The improvements around deleting temporary files and serialized snapshots
> >>> > afaict don't require a dedicated process - they're only relevant during
> >>> > startup. We could use the approach of renaming the directory out of the way as
> >>> > done in this patchset but perform the cleanup in the startup process after
> >>> > we're up.
>
> BTW I know you don't like the dedicated process approach, but one
> improvement to that approach could be to shut down the custodian process
> when it has nothing to do.

Having a central cleanup process makes a lot of sense. There is a long
list of potential tasks for such a process. My understanding is that
autovacuum already has an interface for handling additional workload
types, which is how BRIN indexes are handled. Do we really need a new
process? If so, lets do this now.

Nathan's point that certain tasks are blocking fast startup is a good
one and higher availability is a critical end goal. The thought that
we should complete these tasks during checkpoint is a good one, but
checkpoints should NOT be delayed by long running tasks that are
secondary to availability.

Andres' point that it would be better to avoid long running tasks is
good, if that is possible. That can be done better over time. This
point does not block the higher level goal of better availability
asap, so I support Nathan's overall proposals.

--
Simon Riggs http://www.EnterpriseDB.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2022-06-23 12:09:19 Re: Replica Identity check of partition table on subscriber
Previous Message Nikita Malakhov 2022-06-23 11:46:07 Re: Pluggable toaster