Re: WIP/PoC for parallel backup

From: Asif Rehman <asifr(dot)rehman(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP/PoC for parallel backup
Date: 2019-10-16 13:19:10
Message-ID: CADM=JegetC5yCUUX8tnm=6n_PKTOptRZAE0y2oR3452Lvd9VuQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Oct 7, 2019 at 6:35 PM Asif Rehman <asifr(dot)rehman(at)gmail(dot)com> wrote:

>
>
> On Mon, Oct 7, 2019 at 6:05 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
>> On Mon, Oct 7, 2019 at 8:48 AM Asif Rehman <asifr(dot)rehman(at)gmail(dot)com>
>> wrote:
>> > Sure. Though the backup manifest patch calculates and includes the
>> checksum of backup files and is done
>> > while the file is being transferred to the frontend-end. The manifest
>> file itself is copied at the
>> > very end of the backup. In parallel backup, I need the list of
>> filenames before file contents are transferred, in
>> > order to divide them into multiple workers. For that, the manifest file
>> has to be available when START_BACKUP
>> > is called.
>> >
>> > That means, backup manifest should support its creation while excluding
>> the checksum during START_BACKUP().
>> > I also need the directory information as well for two reasons:
>> >
>> > - In plain format, base path has to exist before we can write the file.
>> we can extract the base path from the file
>> > but doing that for all files does not seem a good idea.
>> > - base backup does not include the content of some directories but
>> those directories although empty, are still
>> > expected in PGDATA.
>> >
>> > I can make these changes part of parallel backup (which would be on top
>> of backup manifest patch) or
>> > these changes can be done as part of manifest patch and then parallel
>> can use them.
>> >
>> > Robert what do you suggest?
>>
>> I think we should probably not use backup manifests here, actually. I
>> initially thought that would be a good idea, but after further thought
>> it seems like it just complicates the code to no real benefit.
>
>
> Okay.
>
>
>> I
>> suggest that the START_BACKUP command just return a result set, like a
>> query, with perhaps four columns: file name, file type ('d' for
>> directory or 'f' for file), file size, file mtime. pg_basebackup will
>> ignore the mtime, but some other tools might find that useful
>> information.
>>
> yes current patch already returns the result set. will add the additional
> information.
>
>
>> I wonder if we should also split START_BACKUP (which should enter
>> non-exclusive backup mode) from GET_FILE_LIST, in case some other
>> client program wants to use one of those but not the other. I think
>> that's probably a good idea, but not sure.
>>
>
> Currently pg_basebackup does not enter in exclusive backup mode and other
> tools have to
> use pg_start_backup() and pg_stop_backup() functions to achieve that.
> Since we are breaking
> backup into multiple command, I believe it would be a good idea to have
> this option. I will include
> it in next revision of this patch.
>
>
>>
>> I still think that the files should be requested one at a time, not a
>> huge long list in a single command.
>>
> sure, will make the change.
>
>
>

I have refactored the functionality into multiple smaller patches in order
to make the review process easier. I have divided the code into backend
changes and pg_basebackup changes. The
backend replication system now supports the following commands:

- START_BACKUP
- SEND_FILE_LIST
- SEND_FILES_CONTENT
- STOP_BACKUP

The START_BACKUP will not return the list of files, instead SEND_FILE_LIST
is used for that. The START_BACKUP
now calls pg_start_backup and returns starting WAL position, tablespace
header information and content of backup label file.
Initially I was using tmp files to store the backup_label content but that
turns out to be bad idea, because there can be multiple
non-exclusive backups running. The backup label information is needed by
stop_backup so pg_basebackup will send it as part
of STOP_BACKUP.

The SEND_FILE_LIST will return the list of files. It will be returned as
resultset having four columns (filename, type, size, mtime).
The SEND_FILES_CONTENT can now return the single file or multiple files as
required. There is not much change required to
support both, so I believe it will be much useable this way if other tools
want to utilise it.

As per suggestion from Robert, I am currently working on making changes in
pg_basebackup to fetch files one by one. However that's not complete and
the attach patch
is still using the old method of multi-file fetching to test the backend
commands. I will send an updated patch which will contain the changes on
fetching file one by one.

I wanted to share the backend patch to get some feedback in the mean time.

Thanks,

--
Asif Rehman
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca

Attachment Content-Type Size
0001-Refactor-some-basebackup-code-to-increase-reusabilit.patch application/octet-stream 23.6 KB
0004-parallel-backup-testcase.patch application/octet-stream 19.8 KB
0003-pg_basebackup-changes-for-parallel-backup.patch application/octet-stream 20.9 KB
0002-backend-changes-for-parallel-backup.patch application/octet-stream 25.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2019-10-16 13:26:42 Re: BUG #16045: vacuum_db crash and illegal memory alloc after pg_upgrade from PG11 to PG12
Previous Message Alvaro Herrera 2019-10-16 12:53:56 Re: v12.0: segfault in reindex CONCURRENTLY