Re: PATCH: Exclude additional directories in pg_basebackup

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: David Steele <david(at)pgmasters(dot)net>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>
Subject: Re: PATCH: Exclude additional directories in pg_basebackup
Date: 2016-09-07 02:25:30
Message-ID: CAB7nPqQNQ-5PWEy-5c6x1gei99mQM+L+Dt7oG7SHb=uDbWGRmQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Sep 7, 2016 at 12:16 AM, David Steele <david(at)pgmasters(dot)net> wrote:
> On 9/1/16 9:53 AM, Peter Eisentraut wrote:
>>
>> On 8/15/16 3:39 PM, David Steele wrote:
>>>
>>> That patch got me thinking about what else could be excluded and after
>>> some investigation I found the following: pg_notify, pg_serial,
>>> pg_snapshots, pg_subtrans. These directories are all cleaned, zeroed,
>>> or rebuilt on server start.
>>>
>>> The attached patch adds these directories to the pg_basebackup
>>> exclusions and takes an array-based approach to excluding directories
>>> and files during backup.
>>
>>
>> We do support other backup methods besides using pg_basebackup. So I
>> think we need to document the required or recommended handling of each
>> of these directories. And then pg_basebackup should become a consumer
>> of that documentation.
>>
>> The current documentation on this is at
>>
>> <https://www.postgresql.org/docs/devel/static/continuous-archiving.html#BACKUP-LOWLEVEL-BASE-BACKUP-DATA>,
>> which covers pg_xlog and pg_replslot. I think that documentation should
>> be expanded, maybe with a simple list that is easy to copy into an
>> exclude file, following by more detail on each directory.
>
>
> Attached is a new patch that adds sgml documentation. I can expand on each
> directory individually if you think that's necessary, but thought it was
> better to lump them into a few categories.

+ be ommitted from the backup as they will be initialized on postmaster
+ startup. If the <xref linkend="GUC-STATS-TEMP-DIRECTORY"> is set and is
+ under the database cluster directory then the contents of the directory
+ specified by <xref linkend="GUC-STATS-TEMP-DIRECTORY"> can also
be ommitted.

s/ommitted/omitted/

+#define EXCLUDE_DIR_MAX 8
+#define EXCLUDE_DIR_STAT_TMP 0
+
+const char *excludeDirContents[EXCLUDE_DIR_MAX] =
+{
+ /*
+ * Skip temporary statistics files. The first array position will be
+ * filled with the value of pgstat_stat_directory relative to PGDATA.
+ * PG_STAT_TMP_DIR must be skipped even when stats_temp_directory is set
+ * because PGSS_TEXT_FILE is always created there.
+ */
+ NULL,
I find that ugly. I'd rather use an array with undefined size for the
fixed elements finishing by NULL, remove EXCLUDE_DIR_MAX and
EXCLUDE_DIR_STAT_TMP and use a small routine to do the work done on
_tarWriteHeader...
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Noah Misch 2016-09-07 02:54:08 Re: Suggestions for first contribution?
Previous Message Craig Ringer 2016-09-07 02:01:59 Re: [PATCH] COPY vs \copy HINT