Re: add checkpoint stats of snapshot and mapping files of pg_logical dir

From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Cary Huang <cary(dot)huang(at)highgo(dot)ca>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: add checkpoint stats of snapshot and mapping files of pg_logical dir
Date: 2022-03-15 20:17:05
Message-ID: 20220315201705.GA1087373@nathanxps13
Lists: pgsql-hackers

On Tue, Mar 15, 2022 at 11:04:26AM +0900, Michael Paquier wrote:
> On Mon, Mar 14, 2022 at 03:54:19PM +0530, Bharath Rupireddy wrote:
>> At times, the snapshot or mapping files can be large in number and on
>> some platforms it takes time for checkpoint to process all of them.
>> Having the stats about them in server logs can help us better analyze
>> why checkpoint took a long time and provide a better RCA.
>
> Do you have any numbers to share regarding that? Seeing information
> about 1k WAL segments being recycled and/or removed by a checkpoint
> where the operation takes dozens of seconds to complete because we can
> talk about hundreds of gigs worth of files moved around. If we are
> talking about 100~200 files up to 10~20kB each for snapshot and
> mapping files, the information has less value, worth only a portion of
> one WAL segment.

I don't have specific numbers to share, but as noted elsewhere [0], I
routinely see lengthy checkpoints that spend a lot of time in these cleanup
tasks.

[0] https://postgr.es/m/18ED8B1F-7F5B-4ABF-848D-45916C938BC7%40amazon.com

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
