From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Asif Rehman <asifr(dot)rehman(at)gmail(dot)com> |
Cc: | Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: WIP/PoC for parallel backup |
Date: | 2019-10-28 15:28:49 |
Message-ID: | CA+TgmobkuqqDNuPiJQxYM5A2cB03x+WJ2EhDGLzxFzNcdCkPVQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Oct 28, 2019 at 10:03 AM Asif Rehman <asifr(dot)rehman(at)gmail(dot)com> wrote:
> I have updated the patch to include the changes suggested by Jeevan. This patch also implements the thread workers instead of
> processes and fetches a single file at a time. The tar format has been disabled for first version of parallel backup.
Looking at 0001-0003:
It's not clear to me what the purpose of the start WAL location is
supposed to be. As far as I can see, SendBackupFiles() stores it in a
variable which is then used for exactly nothing, and nothing else uses
it. It seems like that would be part of a potential incremental
backup feature, but I don't see what it's got to do with parallel full
backup.
The tablespace_path option appears entirely unused, and I don't know
why that should be necessary here, either.
STORE_BACKUPFILE() seems like maybe it should be a function rather
than a macro, and also probably be renamed, because it doesn't store
files and the argument's not necessarily a file.
SendBackupManifest() does not send a backup manifest in the sense
contemplated by the email thread on that subject. It sends a file
list. That seems like the right idea - IMHO, anyway - but you need to
do a thorough renaming.
I think it would be fine to decide that this facility won't support
exclusive-mode backup.
I don't think much of having both sendDir() and sendDir_(). The latter
name is inconsistent with any naming convention we have, and there
seems to be no reason not to just add an argument to sendDir() and
change the callers.
I think we should rename - perhaps as a preparatory patch - the
sizeonly flag to dryrun, or something like that.
The resource cleanup does not look right. You've included calls to
PG_ENSURE_ERROR_CLEANUP(base_backup_cleanup, 0) in both StartBackup()
and StopBackup(), but what happens if there is an error or even a
clean shutdown of the connection in between? I think that there needs
to be some change here to ensure that a walsender will always call
base_backup_cleanup() when it exits; I think that'd probably remove
the need for any PG_ENSURE_ERROR_CLEANUP calls at all, including ones
we have already. This might also be something that could be done as a
separate, prepatory refactoring patch.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2019-10-28 15:35:47 | Re: Proposition to use '==' as synonym for 'IS NOT DISTINCT FROM' |
Previous Message | Tomas Vondra | 2019-10-28 15:20:48 | Using multiple extended statistics for estimates |