Re: Refactor backup related code (was: Is it correct to say, "invalid data in file \"%s\"", BACKUP_LABEL_FILE in do_pg_backup_stop?)

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, David Steele <david(at)pgmasters(dot)net>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Refactor backup related code (was: Is it correct to say, "invalid data in file \"%s\"", BACKUP_LABEL_FILE in do_pg_backup_stop?)
Date: 2022-09-20 01:50:37
Message-ID: YykcbQBuFzXI1x7m@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Sep 19, 2022 at 06:26:34PM +0530, Bharath Rupireddy wrote:
> On Mon, Sep 19, 2022 at 2:38 PM Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
> Fixed. I believed that the regression tests cover pg_backup_start()
> and pg_backup_stop(), and relied on make check-world, surprisingly
> there's no test that covers these functions. Is it a good idea to add
> tests for these functions in misc_functions.sql or backup.sql or
> somewhere so that they get run as part of make check? Thoughts?

The main regression test suite should not include direct calls to
pg_backup_start() or pg_backup_stop() as these depend on wal_level,
and we spend a certain amount of resources in keeping the tests a
maximum portable across such configurations, wal_level = minimal being
one of them. One portable example is in 001_stream_rep.pl.

> That's a good idea. I'm marking a flag if the label name overflows (>
> MAXPGPATH), later using it in do_pg_backup_start() to error out. We
> could've thrown error directly, but that changes the behaviour - right
> now, first "
> wal_level must be set to \"replica\" or \"logical\" at server start."
> gets emitted and then label name overflow error - I don't want to do
> that.

- if (strlen(backupidstr) > MAXPGPATH)
+ if (state->name_overflowed == true)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("backup label too long (max %d bytes)",
It does not strike me as a huge issue to force a truncation of such
backup label names. 1024 is large enough for basically all users,
in my opinion. Or you could just truncate in the internal logic, but
still complain at MAXPGPATH - 1 as the last byte would be for the zero
termination. In short, there is no need to complicate things with
name_overflowed.

>> In v7 patch, since pg_backup_stop() calls build_backup_content(),
>> backup_label and history_file seem not to be carried from do_pg_backup_stop()
>> to pg_backup_stop(). This makes me still think that it's better not to include
>> them in BackupState...
>
> I'm a bit worried about the backup state being spread across if we
> separate out backup_label and history_file from BackupState and keep
> tablespace_map contents there. As I said upthread, we are not
> allocating memory for them at the beginning, we allocate only when
> needed. IMO, this code is readable and more extensible.

+ StringInfo backup_label; /* backup_label file contents */
+ StringInfo tablespace_map; /* tablespace_map file contents */
+ StringInfo history_file; /* history file contents */
IMV, repeating a point I already made once upthread, BackupState
should hold none of these. Let's just generate the contents of these
files in the contexts where they are needed, making BackupState
something to rely on to build them in the code paths where they are
necessary. This is going to make the reasoning around the memory
contexts where each one of them is stored much easier and reduce the
changes of bugs in the long-term.

> I've also taken help of the error callback mechanism to clean up the
> allocated memory in case of a failure. For do_pg_abort_backup() cases,
> I think we don't need to clean the memory because that gets called on
> proc exit (before_shmem_exit()).

Memory could still bloat while the process running the SQL functions
is running depending on the error code path, anyway.
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message kuroda.hayato@fujitsu.com 2022-09-20 02:05:05 RE: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Previous Message Michael Paquier 2022-09-20 01:33:24 Re: pg_create_logical_replication_slot argument incongruency