Re: WIP/PoC for parallel backup

From: Asif Rehman <asifr(dot)rehman(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP/PoC for parallel backup
Date: 2019-10-30 14:16:11
Message-ID: CADM=JeisXQ4ONsbeByC=4jEuL94WQz-B9cEVZ7CFpLnatvFJgQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Oct 28, 2019 at 8:29 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Mon, Oct 28, 2019 at 10:03 AM Asif Rehman <asifr(dot)rehman(at)gmail(dot)com>
> wrote:
> > I have updated the patch to include the changes suggested by Jeevan.
> This patch also implements the thread workers instead of
> > processes and fetches a single file at a time. The tar format has been
> disabled for first version of parallel backup.
>
> Looking at 0001-0003:
>
> It's not clear to me what the purpose of the start WAL location is
> supposed to be. As far as I can see, SendBackupFiles() stores it in a
> variable which is then used for exactly nothing, and nothing else uses
> it. It seems like that would be part of a potential incremental
> backup feature, but I don't see what it's got to do with parallel full
> backup.
>

'startptr' is used by sendFile() during checksum verification. Since
SendBackupFiles() is using sendFIle we have to set a valid WAL location.

> The tablespace_path option appears entirely unused, and I don't know
> why that should be necessary here, either.
>

This is to calculate the basepathlen. We need to exclude the tablespace
location (or
base path) from the filename before it is sent to the client with sendFile
call. I added
this option primarily to avoid performing string manipulation on filename
to extract the
tablespace location and then calculate the basepathlen.

Alternatively we can do it by extracting the base path from the received
filename. What
do you suggest?

>
> STORE_BACKUPFILE() seems like maybe it should be a function rather
> than a macro, and also probably be renamed, because it doesn't store
> files and the argument's not necessarily a file.
>
Sure.

>
> SendBackupManifest() does not send a backup manifest in the sense
> contemplated by the email thread on that subject. It sends a file
> list. That seems like the right idea - IMHO, anyway - but you need to
> do a thorough renaming.
>

I'm considering the following command names:
START_BACKUP
- Starts the backup process

SEND_BACKUP_FILELIST (Instead of SEND_BACKUP_MANIFEST)
- Sends the list of all files (along with file information such as
filename, file type (directory/file/link),
file size and file mtime for each file) to be backed up.

SEND_BACKUP_FILES
- Sends one or more files to the client.

STOP_BACKUP
- Stops the backup process.

I'll update the function names accordingly after your confirmation. Of
course, suggestions for
better names are welcome.

>
> I think it would be fine to decide that this facility won't support
> exclusive-mode backup.
>

Sure. Will drop this patch.

>
> I don't think much of having both sendDir() and sendDir_(). The latter
> name is inconsistent with any naming convention we have, and there
> seems to be no reason not to just add an argument to sendDir() and
> change the callers.

> I think we should rename - perhaps as a preparatory patch - the
> sizeonly flag to dryrun, or something like that.
>

Sure, will take care of it.

> The resource cleanup does not look right. You've included calls to
> PG_ENSURE_ERROR_CLEANUP(base_backup_cleanup, 0) in both StartBackup()
> and StopBackup(), but what happens if there is an error or even a
> clean shutdown of the connection in between? I think that there needs

to be some change here to ensure that a walsender will always call
> base_backup_cleanup() when it exits; I think that'd probably remove
> the need for any PG_ENSURE_ERROR_CLEANUP calls at all, including ones
> we have already. This might also be something that could be done as a
> separate, prepatory refactoring patch.
>

You're right. I didn't handle this case properly. I will removed
PG_ENSURE_ERROR_CLEANUP
calls and replace it with before_shmem_exit handler. This way
whenever backend process exits,
base_backup_cleanup will be called:
- If it exists before calling the do_pg_stop_backup, base_backup_cleanup
will take care of cleanup.
- otherwise in case of a clean shutdown (after calling do_pg_stop_backup)
then base_backup_cleanup
will simply return without doing anything.

--
Asif Rehman
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2019-10-30 15:30:52 Re: PL/Python fails on new NetBSD/PPC 8.0 install
Previous Message Tom Lane 2019-10-30 14:13:26 Re: Proposal: Global Index