Re: Allow pg_archivecleanup to remove backup history files

From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: michael(at)paquier(dot)xyz, masao(dot)fujii(at)oss(dot)nttdata(dot)com, bharath(dot)rupireddyforpostgres(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Allow pg_archivecleanup to remove backup history files
Date: 2023-06-23 08:37:09
Message-ID: 0fd031265a324696c2ac58c5d4b394ef@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2023-06-22 16:47, Kyotaro Horiguchi wrote:
Thanks for your review!

>
> 0002:
>
> + * Check file name.
> + *
> + * We skip files which are not WAL file or partial WAL file.
>
> There's no need to spend this many lines describing this, and it's
> suggested to avoid restating what the code does. I think a comment
> might not be necessary. But if we decide to have one, it could look
> like this:
>
>> /* we're only removing specific types of files */

As you mentioned, this comment is restatement of the codes.
Removed the comment.

>
> Other than that, it looks good to me.
>
>
> 0003:
> + <para>
> + Remove backup history files.
>
> I might be overthinking it, but I'd phrase it as, "Remove backup
> history files *as well*.". (The --help message describes the same
> thing using "including".)

Agreed.

> + For details about backup history file, please refer to the
> <xref linkend="backup-base-backup"/>.
>
> We usually write this as simple as "See <xref...> for details (of the
> backup history files)" or "Refer to <xref..> for more information
> (about the backup history files)." or such like... (I think)

Agreed. I used the former one.

> +bool cleanBackupHistory = false; /* remove files including
> + * backup history files */
>
> The indentation appears to be inconsistent.

Modified.

>
>
> - * Truncation is essentially harmless, because we skip names of
> - * length other than XLOG_FNAME_LEN. (In principle, one could use
> - * a 1000-character additional_ext and get trouble.)
> + * Truncation is essentially harmless, because we check the file
> + * format including the length immediately after this.
> + * (In principle, one could use a 1000-character additional_ext
> + * and get trouble.)
> */
> strlcpy(walfile, xlde->d_name, MAXPGPATH);
> TrimExtension(walfile, additional_ext);
>
> The revised comment seems to drift from the original point. Except for
> a potential exception by a too-long addition_ext, the original comment
> asserted that the name truncation was safe since it wouldn't affect
> the files we're removing. In other words, it asserted that the
> filenames to be removed won't be truncated and any actual truncation
> wouldn't lead to accidental deletions.
>
> Hence, I think we should adjust the comment to maintain its original
> point, and extend it to include backup history files. A possible
> revision could be (very simple):
>
>> * Truncation is essentially harmless, because we skip names of
>> length
>> * longer than the length of backup history file. (In principle, one
>> * could use a 1000-character additional_ext and get trouble.)

This is true, but we do stricter check for preventing accidental
deletion at the below code than just skipping "names of length longer
than the length of backup history file".

| if (!IsXLogFileName(walfile) && !IsPartialXLogFileName(walfile)
&&
| !(cleanBackupHistory && IsBackupHistoryFileName(walfile)))
| continue;

How about something like this?

| Truncation is essentially harmless, because we skip files whose format
is different from WAL files and backup history files.

> Regarding the TAP test, it checks that the --clean-backup-history does
> indeed remove backup history files. However, it doesn't check that
> this doesn't occur if the option isn't specified. Shouldn't we test
> for the latter scenario as well?

Agreed.
Added a backup history file to @walfiles_with_gz.

> +sub get_walfiles
> +{
> <snip..>
> +
> +create_files(get_walfiles(@walfiles_with_gz));
>
> The new function get_walfiels() just increases the line count without
> cutting any lines. The following changes are sufficient and easier to
> read (at least for me).
>
>> open my $file, '>', "$tempdir/$fn->{name}";
>
>> foreach my $fn (map {$_->{name}} @walfiles_with_gz)

Agreed.
Remove get_walfiles() and added some changes as above.

--
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION

Attachment Content-Type Size
v12-0001-Introduce-pg_archivecleanup-into-getopt_long.patch text/x-diff 4.1 KB
v12-0002-Preliminary-refactoring-for-a-subsequent-patch.patch text/x-diff 5.6 KB
v12-0003-Allow-pg_archivecleanup-to-remove-backup-history.patch text/x-diff 8.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message jian he 2023-06-23 08:52:27 Re: Do we want a hashset type?
Previous Message Amit Kapila 2023-06-23 08:34:15 Re: Assert while autovacuum was executing