Re: O(n) tasks cause lengthy startups and checkpoints

From: SATYANARAYANA NARLAPURAM <satyanarlapuram(at)gmail(dot)com>
To: "Bossart, Nathan" <bossartn(at)amazon(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: O(n) tasks cause lengthy startups and checkpoints
Date: 2021-12-01 21:35:00
Message-ID: CAHg+QDfhFgdCj2MEfDNad7z213A27JP-JomDh62+Tjqirx0-Ug@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

+1 to the idea. I don't see a reason why checkpointer has to do all of
that. Keeping checkpoint to minimal essential work helps servers recover
faster in the event of a crash.

RemoveOldXlogFiles is also an O(N) operation that can at least be avoided
during the end of recovery (CHECKPOINT_END_OF_RECOVERY) checkpoint. When a
sufficient number of WAL files accumulated and the previous checkpoint did
not get a chance to cleanup, this can increase the unavailability of the
server.

RemoveOldXlogFiles(_logSegNo, RedoRecPtr, recptr);

On Wed, Dec 1, 2021 at 12:24 PM Bossart, Nathan <bossartn(at)amazon(dot)com> wrote:

> Hi hackers,
>
> Thanks to 61752af, SyncDataDirectory() can make use of syncfs() to
> avoid individually syncing all database files after a crash. However,
> as noted earlier this year [0], there are still a number of O(n) tasks
> that affect startup and checkpointing that I'd like to improve.
> Below, I've attempted to summarize each task and to offer ideas for
> improving matters. I'll likely split each of these into its own
> thread, given there is community interest for such changes.
>
> 1) CheckPointSnapBuild(): This function loops through
> pg_logical/snapshots to remove all snapshots that are no longer
> needed. If there are many entries in this directory, this can take
> a long time. The note above this function indicates that this is
> done during checkpoints simply because it is convenient. IIUC
> there is no requirement that this function actually completes for a
> given checkpoint. My current idea is to move this to a new
> maintenance worker.
> 2) CheckPointLogicalRewriteHeap(): This function loops through
> pg_logical/mappings to remove old mappings and flush all remaining
> ones. IIUC there is no requirement that the "remove old mappings"
> part must complete for a given checkpoint, but the "flush all
> remaining" portion allows replay after a checkpoint to only "deal
> with the parts of a mapping that have been written out after the
> checkpoint started." Therefore, I think we should move the "remove
> old mappings" part to a new maintenance worker (probably the same
> one as for 1), and we should consider using syncfs() for the "flush
> all remaining" part. (I suspect the main argument against the
> latter will be that it could cause IO spikes.)
> 3) RemovePgTempFiles(): This step can delay startup if there are many
> temporary files to individually remove. This step is already
> optionally done after a crash via the remove_temp_files_after_crash
> GUC. I propose that we have startup move the temporary file
> directories aside and create new ones, and then a separate worker
> (probably the same one from 1 and 2) could clean up the old files.
> 4) StartupReorderBuffer(): This step deletes logical slot data that
> has been spilled to disk. This code appears to be written to avoid
> deleting different types of files in these directories, but AFAICT
> there shouldn't be any other files. Therefore, I think we could do
> something similar to 3 (i.e., move the directories aside during
> startup and clean them up via a new maintenance worker).
>
> I realize adding a new maintenance worker might be a bit heavy-handed,
> but I think it would be nice to have somewhere to offload tasks that
> really shouldn't impact startup and checkpointing. I imagine such a
> process would come in handy down the road, too. WDYT?
>
> Nathan
>
> [0] https://postgr.es/m/32B59582-AA6C-4609-B08F-2256A271F7A5%40amazon.com
>
>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2021-12-01 21:36:58 Re: row filtering for logical replication
Previous Message Peter Smith 2021-12-01 21:31:06 Re: row filtering for logical replication