Re: Refactor backup related code (was: Is it correct to say, "invalid data in file \"%s\"", BACKUP_LABEL_FILE in do_pg_backup_stop?)

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, David Steele <david(at)pgmasters(dot)net>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Refactor backup related code (was: Is it correct to say, "invalid data in file \"%s\"", BACKUP_LABEL_FILE in do_pg_backup_stop?)
Date: 2022-09-18 07:53:57
Message-ID: CALj2ACX0wjo+49hbUmvc_zT1zwdqFOQyhorN0Ox-Rk6v97Nejw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Sep 18, 2022 at 7:38 AM Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
>
> On 2022/09/17 16:18, Bharath Rupireddy wrote:
> > Good idea. It makes a lot more sense to me, because xlog.c is already
> > a file of 9000 LOC. I've created xlogbackup.c/.h files and added the
> > new code there. Once this patch gets in, I can offer my hand to move
> > do_pg_backup_start() and do_pg_backup_stop() from xlog.c and if okay,
> > pg_backup_start() and pg_backup_stop() from xlogfuncs.c to
> > xlogbackup.c/.h. Then, we might have to create new get/set APIs for
> > XLogCtl fields that do_pg_backup_start() and do_pg_backup_stop()
> > access.
>
> The definition of SessionBackupState enum type also should be in xlogbackup.h?

Correct. Basically, all the backup related code from xlog.c,
xlogfuncs.c and elsewhere can go to xlogbackup.c/.h. I will focus on
that refactoring patch once this gets in.

> > We need to carry tablespace_map contents from do_pg_backup_start()
> > till pg_backup_stop(), backup_label and history_file too are easy to
> > carry across. Hence it will be good to have all of them i.e.
> > tablespace_map, backup_label and history_file in the BackupState
> > structure. IMO, this way is good.
>
> backup_label and history_file are not carried between pg_backup_start()
> and _stop(), so don't need to be saved in BackupState. Their contents
> can be easily created from other saved fields in BackupState,
> if necessary. So at least for me it's better to get rid of them from
> BackupState and don't allocate TopMemoryContext memory for them.

Yeah, but they have to be carried from do_pg_backup_stop() to
pg_backup_stop() or callers and also instead of keeping tablespace_map
in BackupState and others elsewhere don't seem to be a good idea to
me. IMO, BackupState is a good place to contain all the information
that's carried across various functions. I've changed the code to
lazily (upon first use in the backend) allocate memory for all of them
as we're concerned of the memory allocation beforehand.

> > Yeah, I think that can be solved by passing in backup_state to
> > do_pg_abort_backup(). If okay, I can work on this too as 0002 patch in
> > this thread or we can discuss this separately to get more attention
> > after this refactoring patch gets in.
>
> Or, to avoid such memory bloat, how about allocating the memory for
> backup_state only when it's NULL?

Ah my bad, I missed that. Done now.

> > I'm attaching v5 patch with the above review comments addressed,
> > please review it further.
>
> Thanks for updating the patch!

Thanks for reviewing it.

> + char startxlogfile[MAXFNAMELEN_BACKUP]; /* backup start WAL file */
> <snip>
> + char stopxlogfile[MAXFNAMELEN_BACKUP]; /* backup stop WAL file */
>
> These file names seem not necessary in BackupState because they can be
> calculated from other fields like startpoint and starttli, etc when
> making backup_label and history file contents. If we remove them from
> BackupState, we can also remove the definition of MAXFNAMELEN_BACKUP
> macro from xlogbackup.h.

Done.

> + /* construct backup_label contents */
> + build_backup_content(state, false);
>
> In basebackup case, build_backup_content() is called unnecessary twice
> because do_pg_stop_backup() and its caller, perform_base_backup() call
> that. This makes me think that it's better to get rid of the call to
> build_backup_content() from do_pg_backup_stop(). Instead its callers,
> perform_base_backup() and pg_backup_stop() should call that.

Yeah, it's a good idea. Done that way. It's easier because we can
create backup_label file contents at any point of time after
pg_backup_start().

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment Content-Type Size
v6-0001-Refactor-backup-related-code.patch application/octet-stream 32.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bharath Rupireddy 2022-09-18 14:09:24 Re: Refactor backup related code (was: Is it correct to say, "invalid data in file \"%s\"", BACKUP_LABEL_FILE in do_pg_backup_stop?)
Previous Message Fujii Masao 2022-09-18 07:42:42 Re: is_superuser is not documented