Re: pg_basebackup vs. Windows and tablespaces

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Dilip kumar <dilip(dot)kumar(at)huawei(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Magnus Hagander <magnus(at)hagander(dot)net>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_basebackup vs. Windows and tablespaces
Date: 2014-12-14 06:24:13
Message-ID: CAA4eK1JytXuugsxtLajE6=__xa+hQKChi5U-FvduKEJDm_oeAA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Dec 13, 2014 at 10:48 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> > On 11/20/2014 02:27 AM, Amit Kapila wrote:
> >> Now the part where I would like to receive feedback before revising the
> >> patch is on the coding style. It seems to me from Tom's comments that
> >> he is not happy with the code, now I am not sure which part of the
patch
> >> he thinks needs change. Tom if possible, could you be slightly more
> >> specific about your concern w.r.t code?
>
> > In view of the request above for comments from Tom, I have moved this
> > back to "Needs Review".
>
> Sorry, I was not paying very close attention to this thread and missed
> the request for comments. A few such:
>
> 1. The patch is completely naive about what might be in the symlink
> path string; eg embedded spaces in the path would break it. On at
> least some platforms, newlines could be in the path as well. I'm not
> sure about how to guard against this while maintaining human readability
> of the file.
>

I will look into this and see what best can be done.

> 2. There seems to be more going on here than what is advertised, eg
> why do we need to add an option to the BASE_BACKUP command

This is to ensure that symlink file is generated only for tar format;
server is not aware of whether the backup is generated for plain format
or tar format. We don't want to do it for plain format as for that
client (pg_basebackup) can update the symlinks via -T option and backing
up symlink file during that operation can lead to spurious symlinks after
archive recovery. I have given the reason why we want to accomplish it
only for tar format in my initial mail.

> (and if
> we do need it, doesn't it need to be documented in protocol.sgml)?

I shall take care of it in next version of patch.

> And why is the RelationCacheInitFileRemove call relocated?
>

Because it assumes that tablespace directory pg_tblspc is in
place and it tries to remove the files by reading pg_tblspc
directory as well. Now as we setup the symlinks in pg_tblspc
after reading symlink file, so we should remove relcache init
file once the symlinks are setup in pg_tblspc directory.

> 3. Not terribly happy with the changes made to the API of
> do_pg_start_backup, eg having to be able to parse "DIR *" in its
> arguments seems like a lot of #include creep. xlog.h is pretty
> central so I'm not happy about plastering more #includes in it.
>

The reason of adding new include in xlog.c is for use of tablespaceinfo
structure which I have now kept in basebackup.h.

The reason why I have done this way is because do_pg_start_backup has
some functionality common to both non-exclusive and exclusive backups and
for this feature we have to do some work common for both non-exclusive
and exclusive backup which is to generate the symlink label file for
non-exclusive backups and write the symlink label file for exclusive
backups using that information. Doing this way seems right to me
as we are already doing something like that for backup label file.

Another possible way could be to write a new function in xlogutils.c
to do the symlink label stuff and then use the same in xlog.c, I think
that way we could avoid any new include in xlog.c. However for this we
need to have include in xlogutils.c to make it aware of tablespaceinfo
structure.

> 4. In the same vein, publicly declaring a struct with a name as
> generic as "tablespaceinfo" doesn't seem like a great idea, when
> its usage is far from generic.
>

This is related to above point, we need to use this for both
non-exclusive and exclusive backups and the work for exclusive
backups is done outside of basebackup.c due to which we need
to expose the same.

Any other better idea to address points 3 and 4?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2014-12-14 08:03:29 Confusing comment in tidbitmap.c
Previous Message Mark Kirkwood 2014-12-14 06:12:36 Re: Commitfest problems