Re: Allow pg_archivecleanup to remove backup history files

From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Allow pg_archivecleanup to remove backup history files
Date: 2023-05-31 01:51:19
Message-ID: be11e1f6f55854c3d5a818dfad819603@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2023-05-26 10:07, Kyotaro Horiguchi wrote:
Thanks for your review!

> + printf(_(" -x --strip-extension=EXT clean up files if they have
> this extension\n"));
>
> Perhaps a comma is needed after "-x".

That's an oversight. Modified it.

> The apparent inconsistency
> between the long name and the description is perplexing. I think it
> might be more suitable to reword the descrition to "ignore this
> extension while identifying files for cleanup" or something along
> those lines..., and then name the option as "--ignore-extension".

I used 'strip' since it is used in the manual as below:

| -x extension
| Provide an extension that will be stripped from all file names before
deciding if they should be deleted

I think using the same verb both in long name option and in the manual
is natural.
How about something like this?

| printf(_(" -x, --strip-extension=EXT strip this extention before
identifying files fo clean up\n"));

> The patch leaves the code.
>
>> * Truncation is essentially harmless, because we skip names of
>> * length other than XLOG_FNAME_LEN. (In principle, one could use
>> * a 1000-character additional_ext and get trouble.)
>> */
>> strlcpy(walfile, xlde->d_name, MAXPGPATH);
>> TrimExtension(walfile, additional_ext);
>
> The comment is no longer correct about the file name length.

Yeah.
considering parital WAL, it would be not correct even before applying
the patch.
I modifiied the comments as below:

| * Truncation is essentially harmless, because we check the file
| * format including the length immediately after this.

> + if (!IsXLogFileName(walfile) && !IsPartialXLogFileName(walfile) &&
> + (!cleanBackupHistory || !IsBackupHistoryFileName(walfile)))
> + continue;
>
> I'm not certain about the others, but I see this a tad tricky to
> understand at first glance. Here's how I would phrase it. (A nearby
> comment might require a tweak if we go ahead with this change.)
>
> if (!(IsXLogFileName(walfile) || IsPartialXLogFileName(walfile) ||
> (cleanBackupHistory && IsBackupHistoryFileName(walfile))))
>
> or
>
> if (!IsXLogFileName(walfile) && !IsPartialXLogFileName(walfile) &&
> !(cleanBackupHistory && IsBackupHistoryFileName(walfile)))

Thanks for the suggestion, I used the second one.

> By the way, this patch reduces the nesting level. If we go that
> direction, the following structure could be reworked similarly, and I
> believe it results in a more standard structure.
>
> if ((xldir = opendir(archiveLocation)) != NULL)
> {
> ...
> }
> else
> pg_fatal("could not open archive location..

Agreed. Attached 0003 patch for this.

On 2023-05-26 14:19, Michael Paquier wrote:
> On Thu, May 25, 2023 at 11:51:18PM +0900, torikoshia wrote:
>> Updated patches according to your comment.
>
> - ok(!-f "$tempdir/$walfiles[1]",
> - "$test_name: second older WAL file was cleaned up");
> - ok(-f "$tempdir/$walfiles[2]",
> + ok(!-f "$tempdir/@$walfiles[1]",
> + "$test_name: second older WAL/backup history file was cleaned
> up");
> + ok(-f "$tempdir/@$walfiles[2]",
>
> This is still a bit confusing, because as designed the test has a
> dependency on the number of elements present in the list, and the
> description of the test may not refer to what's actually used (the
> second element of each list is either a WAL segment or a backup
> history file). I think that I would just rewrite that so as we have a
> list of WAL segments in an array with the expected result associated
> to each one of them. Basically, something like that:
> my @wal_segments = (
> { name => "SEGMENT1", present => 0 },
> { name => "BACKUPFILE1", present => 1 },
> { name => "SEGMENT2", present => 0 });
>
> Then the last part of run_check() would loop through all the elements
> listed.

Thanks.
Update the patch according to the advice.
I also changed the parameter of run_check() from specifying extension of
oldestkeptwalfile to oldestkeptwalfile itself.

--
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION

Attachment Content-Type Size
v5-0001-Introduce-pg_archivecleanup-into-getopt_long.patch text/x-diff 3.9 KB
v5-0002-Allow-pg_archivecleanup-to-remove-backup-history-.patch text/x-diff 10.8 KB
v5-0003-Refactored-to-reduce-the-nesting-level.patch text/x-diff 5.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Richard Guo 2023-05-31 02:47:03 Re: ERROR: wrong varnullingrels (b 3) (expected (b)) for Var 2/1
Previous Message David Rowley 2023-05-31 00:59:23 Re: Fix incorrect start up costs for WindowAgg paths (bug #17862)