From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: design for parallel backup
Date: 2020-04-15 15:57:29
Message-ID: CA+TgmoZubLXYR+Pd_gi3MVgyv5hQdLm-GBrVXkun-Lewaw12Kg@mail.gmail.com
Lists: pgsql-hackers

Hi,

Over at http://postgr.es/m/CADM=JehKgobEknb+_nab9179HzGj=9EiTzWMOd2mpqr_rifm0Q@mail.gmail.com
there's a proposal for a parallel backup patch which works in the way
that I have always thought parallel backup would work: instead of
having a monolithic command that returns a series of tarballs, you
request individual files from a pool of workers. Leaving aside the
quality-of-implementation issues in that patch set, I'm starting to
think that the design is fundamentally wrong and that we should take a
whole different approach. The problem I see is that it makes a
parallel backup and a non-parallel backup work very differently, and
I'm starting to realize that there are good reasons why you might want
them to be similar.

Specifically, as Andres recently pointed out[1], almost anything that
you might want to do on the client side, you might also want to do on
the server side. We already have an option to let the client compress
each tarball, but you might also want the server to, say, compress
each tarball[2]. Similarly, you might want either the client or the
server to be able to encrypt each tarball, or to compress it with a
different compression algorithm than gzip. If, as is presently the
case, the server is always returning a set of tarballs, it's pretty
easy to see how to make this work in the same way on either the
client or the server, but if the server returns a set of tarballs in
non-parallel backup cases, and a set of individual files in parallel
backup cases, it's a lot harder to see how any sort of server-side
processing should work, or how the same mechanism could be used on
either the client side or the server side.
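
To illustrate why a uniform tarball-stream representation helps,
here's a minimal sketch (Python, purely illustrative; none of these
names exist anywhere): any per-tarball transformation composes the
same way no matter which end runs it. gzip here stands in for
whatever compression or encryption step you like.

import gzip
import shutil

# Hypothetical post-processing hook: takes one tarball, emits one
# transformed tarball.  Because both parallel and non-parallel
# backups would produce a stream of tarballs, the same hook could
# run on the server before transmission or on the client after
# receipt.
def postprocess_tarball(in_path, out_path):
    with open(in_path, "rb") as src, gzip.open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)

# e.g. postprocess_tarball("base-1.tar", "base-1.tar.gz") works the
# same whether base-1.tar came from one of several workers or from a
# serial backup.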

So, my new idea for parallel backup is that the server will return
tarballs, but just more of them. Right now, you get base.tar and
${tablespace_oid}.tar for each tablespace. I propose that if you do a
parallel backup, you should get base-${N}.tar and
${tablespace_oid}-${N}.tar for some or all values of N between 1 and
the number of workers, with the server deciding which files ought to
go in which tarballs. This is more or less the naming convention that
BART uses for its parallel backup implementation, which, incidentally,
I did not write. I don't really care if we pick something else, but it
seems like a sensible choice. The reason why I say "some or all" is
that some workers might not get any of the data for a given
tablespace. In fact, it's probably desirable to have different workers
work on different tablespaces as far as possible, to maximize parallel
I/O, but it's quite likely that you will have more workers than
tablespaces. So you might end up, with pg_basebackup -j4, having the
server send you base-1.tar and base-2.tar and base-4.tar, but not
base-3.tar, because worker 3 spent all of its time on user-defined
tablespaces, or was just out to lunch.
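
To make that concrete, here's a rough sketch (Python, just to
illustrate the naming scheme; the real logic would of course live in
the server) of how files might be divvied up, starting each
tablespace on a different worker so that, as far as possible,
different workers hit different tablespaces:

def assign_tarballs(files_by_tablespace, num_workers):
    # files_by_tablespace maps a tablespace OID, or None for the main
    # data directory, to the list of files it contains.  Returns a
    # mapping of tarball name -> files, using the base-${N}.tar /
    # ${tablespace_oid}-${N}.tar convention described above.
    tarballs = {}
    for i, (ts_oid, files) in enumerate(files_by_tablespace.items()):
        prefix = "base" if ts_oid is None else str(ts_oid)
        for j, path in enumerate(files):
            # Offset each tablespace by i, so workers mostly touch
            # distinct tablespaces.  A small tablespace never reaches
            # some workers, which is why a given tarball exists only
            # for *some* values of N.
            n = (i + j) % num_workers + 1
            tarballs.setdefault("%s-%d.tar" % (prefix, n), []).append(path)
    return tarballs

# With num_workers=4 and only two files in the main data directory,
# you'd get base-1.tar and base-2.tar back, but base-3.tar and
# base-4.tar would simply not exist.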

Now, if you use -Fp, those tar files are just going to get extracted
anyway by pg_basebackup itself, so you won't even know they exist.
However, if you use -Ft, you're going to end up with more files than
before. This seems like something of a wart, because you wouldn't
necessarily expect that the set of output files produced by a backup
would depend on the degree of parallelism used to take it. However,
I'm not sure I see a reasonable alternative. The client could try to
glue all of the related tar files sent by the server together into one
big tarfile, but that seems like it would slow down the process of
writing the backup by forcing the different server connections to
compete for the right to write to the same file. Moreover, if you end
up needing to restore the backup, having a bunch of smaller tar files
instead of one big one means you can try to untar them in parallel if
you like, so it seems not impossible that it could be advantageous to
have them split in that case as well.
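
For what it's worth, that parallel restore is trivial to sketch
(again Python, and hypothetical; extracting everything into a single
destination is assumed here to keep it short, whereas tablespace
tarballs would really need to go to their own locations). Since the
tarballs are disjoint, each one can be extracted independently:

import glob
import os
import tarfile
from concurrent.futures import ProcessPoolExecutor

def extract_one(tar_path, dest_dir):
    # Tarballs produced by different workers contain disjoint sets of
    # files, so concurrent extraction into one directory is safe.
    with tarfile.open(tar_path) as tf:
        tf.extractall(dest_dir)

def parallel_untar(backup_dir, dest_dir, jobs=4):
    futures = []
    with ProcessPoolExecutor(max_workers=jobs) as pool:
        for t in glob.glob(os.path.join(backup_dir, "*.tar")):
            futures.append(pool.submit(extract_one, t, dest_dir))
    for f in futures:
        f.result()  # surface any extraction error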

Thoughts?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

[1] http://postgr.es/m/20200412191702.ul7ohgv5gus3tsvo@alap3.anarazel.de
[2] https://www.postgresql.org/message-id/20190823172637.GA16436%40tamriel.snowman.net
