Re: O(n) tasks cause lengthy startups and checkpoints

From: "Euler Taveira" <euler(at)eulerto(dot)com>
To: "Bossart, Nathan" <bossartn(at)amazon(dot)com>, "Andres Freund" <andres(at)anarazel(dot)de>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: O(n) tasks cause lengthy startups and checkpoints
Date: 2021-12-02 02:05:03
Message-ID: d4c7e393-b8be-4879-ad4b-4e80994e5763@www.fastmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Dec 1, 2021, at 9:19 PM, Bossart, Nathan wrote:
> On 12/1/21, 2:56 PM, "Andres Freund" <andres(at)anarazel(dot)de> wrote:
> > On 2021-12-01 20:24:25 +0000, Bossart, Nathan wrote:
> >> I realize adding a new maintenance worker might be a bit heavy-handed,
> >> but I think it would be nice to have somewhere to offload tasks that
> >> really shouldn't impact startup and checkpointing. I imagine such a
> >> process would come in handy down the road, too. WDYT?
> >
> > -1. I think the overhead of an additional worker is disproportional here. And
> > there's simplicity benefits in having a predictable cleanup interlock as well.
>
> Another idea I had was to put some upper limit on how much time is
> spent on such tasks. For example, a checkpoint would only spend X
> minutes on CheckPointSnapBuild() before giving up until the next one.
> I think the main downside of that approach is that it could lead to
> unbounded growth, so perhaps we would limit (or even skip) such tasks
> only for end-of-recovery and shutdown checkpoints. Perhaps the
> startup tasks could be limited in a similar fashion.
Saying that a certain task is O(n) doesn't mean it needs a separate process to
handle it. Did you have a use case or even better numbers (% of checkpoint /
startup time) that makes your proposal worthwhile?

I would try to optimize (1) and (2). However, delayed removal can be a
long-term issue if the new routine cannot keep up with the pace of file
creation (specially if the checkpoints are far apart).

For (3), there is already a GUC that would avoid the slowdown during startup.
Use it if you think the startup time is more important that disk space occupied
by useless files.

For (4), you are forgetting that the on-disk state of replication slots is
stored in the pg_replslot/SLOTNAME/state. It seems you cannot just rename the
replication slot directory and copy the state file. What happen if there is a
crash before copying the state file?

While we are talking about items (1), (2) and (4), we could probably have an
option to create some ephemeral logical decoding files into ramdisk (similar to
statistics directory). I wouldn't like to hijack this thread but this proposal
could alleviate the possible issues that you pointed out. If people are
interested in this proposal, I can start a new thread about it.

--
Euler Taveira
EDB https://www.enterprisedb.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2021-12-02 02:33:31 Re: Data is copied twice when specifying both child and parent table in publication
Previous Message Michael Paquier 2021-12-02 01:49:22 Re: pg_upgrade test for binary compatibility of core data types