Re: Refactor backup related code (was: Is it correct to say, "invalid data in file \"%s\"", BACKUP_LABEL_FILE in do_pg_backup_stop?)

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, David Steele <david(at)pgmasters(dot)net>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Refactor backup related code (was: Is it correct to say, "invalid data in file \"%s\"", BACKUP_LABEL_FILE in do_pg_backup_stop?)
Date: 2022-09-19 12:56:34
Message-ID: CALj2ACVyRkdBpcXpYNOu58QtiqDH6f7-N3Omkx27-Ees=ZQuFw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Sep 19, 2022 at 2:38 PM Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
>
> Thanks for updating the patch!
>
> =# SELECT * FROM pg_backup_start('test', true);
> =# SELECT * FROM pg_backup_stop();
> LOG: server process (PID 15651) was terminated by signal 11: Segmentation fault: 11
> DETAIL: Failed process was running: SELECT * FROM pg_backup_stop();
>
> With v7 patch, pg_backup_stop() caused the segmentation fault.

Fixed. I believed that the regression tests cover pg_backup_start()
and pg_backup_stop(), and relied on make check-world, surprisingly
there's no test that covers these functions. Is it a good idea to add
tests for these functions in misc_functions.sql or backup.sql or
somewhere so that they get run as part of make check? Thoughts?

> =# SELECT * FROM pg_backup_start(repeat('test', 1024));
> ERROR: backup label too long (max 1024 bytes)
> STATEMENT: SELECT * FROM pg_backup_start(repeat('test', 1024));
>
> =# SELECT * FROM pg_backup_start(repeat('testtest', 1024));
> LOG: server process (PID 15844) was terminated by signal 11: Segmentation fault: 11
> DETAIL: Failed process was running: SELECT * FROM pg_backup_start(repeat('testtest', 1024));
>
> When I specified longer backup label in the second run of pg_backup_start()
> after the first run failed, it caused the segmentation fault.
>
>
> + state = (BackupState *) palloc0(sizeof(BackupState));
> + state->name = pstrdup(name);
>
> pg_backup_start() calls allocate_backup_state() and allocates the memory for
> the input backup label before checking its length in do_pg_backup_start().
> This can cause the memory for backup label to be allocated too much
> unnecessary. I think that the maximum length of BackupState.name should
> be MAXPGPATH (i.e., maximum allowed length for backup label).

That's a good idea. I'm marking a flag if the label name overflows (>
MAXPGPATH), later using it in do_pg_backup_start() to error out. We
could've thrown error directly, but that changes the behaviour - right
now, first "
wal_level must be set to \"replica\" or \"logical\" at server start."
gets emitted and then label name overflow error - I don't want to do
that.

> > Yeah, but they have to be carried from do_pg_backup_stop() to
> > pg_backup_stop() or callers and also instead of keeping tablespace_map
> > in BackupState and others elsewhere don't seem to be a good idea to
> > me. IMO, BackupState is a good place to contain all the information
> > that's carried across various functions.
>
> In v7 patch, since pg_backup_stop() calls build_backup_content(),
> backup_label and history_file seem not to be carried from do_pg_backup_stop()
> to pg_backup_stop(). This makes me still think that it's better not to include
> them in BackupState...

I'm a bit worried about the backup state being spread across if we
separate out backup_label and history_file from BackupState and keep
tablespace_map contents there. As I said upthread, we are not
allocating memory for them at the beginning, we allocate only when
needed. IMO, this code is readable and more extensible.

I've also taken help of the error callback mechanism to clean up the
allocated memory in case of a failure. For do_pg_abort_backup() cases,
I think we don't need to clean the memory because that gets called on
proc exit (before_shmem_exit()).

Please review the v8 patch further.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment Content-Type Size
v8-0001-Refactor-backup-related-code.patch application/octet-stream 35.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bharath Rupireddy 2022-09-19 13:16:17 pgstat: stats added in ReadPageInternal() aren't getting reported via pg_stat_wal
Previous Message Ashutosh Bapat 2022-09-19 11:54:11 Re: confirmed_flush_lsn shows LSN of the data that has not yet been received by the logical subscriber.