[patch] reorder tablespaces in basebackup tar stream for backup_label

From: Michael Banck <michael(dot)banck(at)credativ(dot)de>
To: PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: [patch] reorder tablespaces in basebackup tar stream for backup_label
Date: 2017-02-21 10:17:55
Message-ID: 1487672275.15458.15.camel@credativ.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

currently, the backup_label and (I think) the tablespace_map files are
(by design) conveniently located at the beginning of the main tablespace
tarball when making a basebackup. However, (by accident or also by
design?) the main tablespace is the last tarball[1] to be streamed via
the BASE_BACKUP replication protocol command.

For pg_basebackup, this is not a real problem, as either each tablespace
is its own tarfile (in tar format mode), or the files are extracted
anyway (in plain mode).

However, third party tools using the BASE_BACKUP command might want to
extract the backup_label, e.g. in order to figure out the START WAL
LOCATION. If they make a big tarball for the whole cluster potentially
including all external tablespaces, then the backup_label file is
somewhere in the middle of it and it takes a long time for tar to
extract it.

So I am proposing the attached patch, which sends the base tablespace
first, and then all the other external tablespaces afterwards, thus
having base_backup be the first file in the tar in all cases. Does
anybody see a problem with that?

Michael

[1] Chapter 52.3 of the documentation says "one or more CopyResponse
results will be sent, one for the main data directory and one for each
additional tablespace other than pg_default and pg_global.", which makes
it sound like the main data directory is first, but in my testing, this
is not the case.

--
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax: +49 2166 9901-100
Email: michael(dot)banck(at)credativ(dot)de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Attachment Content-Type Size
basebackup_backup_label_position.patch text/x-patch 723 bytes

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Rushabh Lathia 2017-02-21 10:31:54 Re: Push down more UPDATEs/DELETEs in postgres_fdw
Previous Message Christoph Berg 2017-02-21 10:05:53 Re: powerpc(32) point/polygon regression failures on Debian Jessie